EDITOR’S PREFACE

This book was commenced in 1939. The actual writing was 
completed in 1941 and the first proofs were available during the 
next year. Owing however to war conditions and pressure of other 
work the editing took a considerable time and it was not until the 
summer of 1945 that there seemed to be any prospect of early 
publication. 

Of the book itself I can only say that it supplies a long-felt 
want. Mr Tetley’s knowledge of that part of actuarial statistics 
covered by this volume is far-reaching and as far as the subject- 
matter is concerned my editorial functions have been reduced to a 
minimum. 

The task of editing the book has been a pleasurable one. On 
many occasions it has been the means of alleviating the monotony 
of war-time duties. Many hours which would otherwise have been 
weary have been agreeably spent both on reading the manuscript 
and correcting the proofs. 


H. F. 




AUTHOR’S PREFACE TO THE 
SECOND EDITION 

Mistakes in the first edition have been corrected and the 
opportunity has been taken to re-write certain sections, particularly 
in Chapter IV.

In actuarial work the probabilities involved, such as those of 
death or falling sick, are usually very small, while the numbers 
exposed to risk are large. The Poisson distribution is therefore 
particularly appropriate in such work, and a demonstration of its 
derivation from the binomial has now been included in place of one 
of the developments of the normal approximation. 

Although only the theory of large samples is considered it was 
felt that the notion of biased and unbiased estimates was sufficiently 
fundamental to require a brief explanation, and a section has been 
included in Chapter IV.

Table I of the Appendix based on the normal curve was formerly 
taken from Hardy’s book, but in order to secure uniformity with 
the projected new text-books and Short Collection of Actuarial 
Tables, it has been replaced by the more usual type of table which 
will appear in the new publications. 

The author is glad of this opportunity of thanking all those who 
have assisted him by helpful criticism and by drawing attention to 
mistakes in the earlier edition. 


H. T. 



AUTHOR’S PREFACE TO THE 
FIRST EDITION 


As the title implies, no attempt has been made to produce a
statistics text-book for general use. Many excellent works of this
kind are already in existence (some are mentioned in the Bibliographies),
but, until he has qualified, the actuary rarely has time at
his disposal to make a thorough study of the subject of statistics.
It is hoped that this book will give the reader a grasp of the fundamental
ideas of statistics, which will not only enable him to examine
critically the various tables with which he has to deal, but will help
him to develop a ‘statistical sense’ so that he can take a lively and
intelligent interest in developments outside, as well as inside, the
actuarial world.

Some subjects have inevitably been dealt with in a rather 
superficial way and much interesting material has had to be excluded
from the book, which may consequently present an unbalanced
appearance to the professional statistician. Few readers
will be content, therefore, to restrict their statistical studies to this 
volume, but the introduction which it provides should enable more 
ambitious works to be read with greater profit. 

Mathematics has been given rather more prominence than is 
usual in statistical works, because it has been found that actuarial 
students find this form of treatment interesting and stimulating. 
The chapters on graduation reflect the theoretical complexity of 
the methods rather than their practical importance. Thus, graduation
by a summation formula, because of its complicated analytical
basis, has required much fuller treatment than the graphic method, 
which is more important from the practical point of view. 

Chapter I should help the student to revise what he has learned 
in the Statistics chapters of Mathematics for Actuarial Students 
by extending to grouped data the technique he has already used 
for finding means, standard deviations and similar measures of 
location and dispersion. Chapter II deals with the two standard 
frequency distributions of most importance to actuaries, but the 
Poisson distribution has been excluded both in the interests of
brevity and because it proves rather a ‘blind alley’ in the present
state of our knowledge. At the end of Chapter III some elementary 
ideas of non-linear regression and spurious correlation have been 
introduced because of their fundamental importance, although a 
satisfactory treatment of these subjects is quite outside the scope 
of the book. 

Chapter IV is probably the most important because it is not until 
he understands the way in which sample results can be used that 
a student grasps the idea of ‘statistical inference’ which underlies
all modern scientific methods. 

It may seem somewhat illogical to explain tests of a graduation 
before the methods of graduation themselves, but Chapter V to a 
large extent develops from the sampling technique described in the 
previous chapter, while the process of graduation becomes more 
intelligible when the criteria of smoothness and goodness of fit are 
already appreciated. The outline of the $\chi^2$ test is incomplete but the
choice seemed to lie between dealing with it in this way and 
omitting all reference to it. A thorough treatment would involve 
a description of contingency tables and the multivariate normal 
frequency distribution, which would have greatly increased the 
scope of the whole book. 

In each of the remaining chapters standard tables have been used 
as examples of the methods described, and it is hoped that the 
student may thus be saved a good deal of reference to original 
papers and memoranda which he has previously been obliged to 
consult. 

The book has grown out of the lesson notes prepared by the
tutors for the appropriate section of the examinations of the 
Institute of Actuaries and the Faculty of Actuaries. The author 
particularly wishes to acknowledge his indebtedness to Messrs 
A. T. Haynes and O. C. J. Klagge, who prepared the original 
set of notes when the Actuarial Tuition Service was inaugurated, 
and thus provided a framework, which with some modification of 
detail has remained virtually unchanged ever since. Mr H. W. 
Haycocks has also assisted greatly by informed criticism and many 
valuable suggestions. 




Finally my grateful thanks are due to Mr H. Freeman, who has 
not only helped to ensure a logical development of the subject from 
the chapters in Mathematics for Actuarial Students, but has co-ordinated
the two parts of this present book at a time when conditions
made it impossible for the authors to meet. The book also
owes much to his experience and care in reading and checking the
proofs and in seeing them through the press.

H. T. 
THE IMPORTANCE OF STATISTICS
TO THE ACTUARY 

In studying statistics the student is usually handicapped by lack of 
practical background and is puzzled by the apparent uselessness of 
the science in actuarial work. It is hoped that the following remarks 
will be of assistance in clearing up his difficulties and giving a general 
survey of the scope of the book. 

In our complex modern civilization we are constantly coming
across enumerations and records of measurements, even if they 
are no more complicated than new business returns and details of 
claims. Owing to the limitations of the human mind some sort of 
classification and grouping is almost always essential in order to 
reduce them to comprehensible dimensions. Without a knowledge 
of statistics this classification and grouping may be carried out so 
that a misleading impression is conveyed, or, as more frequently 
happens, quite incorrect deductions may be made from data which 
have been collected and properly analysed. 

Perhaps the most important aspect of statistics from the point of 
view of the actuary is that of reliability of results. For instance, 
a value of $q_x$ derived from an “exposed to risk” of ten lives is,
ceteris paribus, less reliable than one based on a hundred, and the 
Theory of Sampling dealt with in Chapter IV enables us to obtain 
a measure of this relative reliability. 

This question is of particular importance in making a graduation 
and in considering the resulting values. The rates of mortality, sickness,
retirement etc. derived from observations are not suitable for
use until they have been graduated. This process may be said to be 
an attempt, by eliminating random errors from the observed data, to 
arrive at the true rates which would be obtained from ideal data of 
unlimited extent. The graduated rates may differ appreciably from 
the ungraduated rates and the question of reliability thus arises. 
Clearly a rate of mortality based on a thousand lives exposed to risk 
should be fairly close to the estimate of the true rate as shown by
the graduated table, but if the exposed to risk is only fifty we should
not be surprised to find the graduated rate differing considerably 
from the observed rate. To what extent are we justified in departing 
from the observed data and functions based on them in deriving 
the graduated values? To answer this question we again need to 
understand the Theory of Sampling. 

Chapters I and II are concerned with general statistics and deal 
in somewhat greater detail with matters which the student will 
already have met in Mathematics for Actuarial Students. Chapter III 
on Correlation is included chiefly for the sake of completeness, as 
this subject is one more for the economist and the professional 
statistician than for the actuary. 



CHAPTER I 


CONTINUOUS AND DISCONTINUOUS 
VARIABLES. GROUPED DATA 

1. Attributes and variables. 

Before data derived from observations and measurements are of 
any practical use they have to be reduced to manageable proportions 
by classification and usually by grouping. The word “statistics” is 
in fact usually restricted to “arranged or classified facts”. Such 
arrangement or classification can be made only with reference to 
some factor which varies from individual to individual and is 
capable of assessment. When this factor is capable of measurement, 
e.g. height, age, sum assured, etc., it is called a variable and may 
be continuous or discontinuous. 

Continuous and discontinuous variables are constantly occurring 
in the mathematical work with which the student has previously 
had to deal and the words are used in the same sense in statistics. 
The phrase “discrete variation” is also fairly common instead of 
“discontinuous variation”. In statistics relating to housing conditions
the number of rooms is an important feature and is an
example of discrete variation, since only integers are permissible.

When the factor used for classification is not capable of measurement
it is called an attribute. Typical examples of attributes are
nationality, class of policy, colour of eyes. 

In this book we shall be concerned chiefly with variables, and 
when this theory has been mastered the student who is interested 
should find little difficulty in the statistics of attributes, although a 
different technique has to be developed. 

2. Continuous and discontinuous variables. 

When the variable is discontinuous the data are automatically
divided into watertight compartments, although grouping may be
used to reduce the statistics to more manageable proportions. For
instance, if the results of a School Certificate examination were to
be tabulated to show for a particular subject how many candidates
obtained 0, 1, 2, ... up to 100 marks, the discontinuous variable
(marks) would itself divide the data into 101 separate divisions
which might be later aggregated into groups of five or ten marks
together.* When the variable is continuous, however, the data are
necessarily grouped, although this may not be obvious from the
way in which they are published. For instance, an office will have
tabulated the numbers of whole-life with profit policies on lives
aged 18, 19, 20, ..., and although this may appear to be an example
of discontinuous variation the lives are actually grouped, the
commonest method being to combine all lives who have a birthday
between the 1st July of one year and the 30th June of the next.
Thus, those born on 1st July 1908 or 30th June 1909 or any intermediate
date would be grouped and treated as aged 32 on 31st
December 1940.

It is usual in classifying data involving a continuous variable to 
use values of this variable at equal intervals so that the frequencies 
in the various groups are comparable. The interval used for classification
is termed the class-interval. The data so classified are said
to form a frequency distribution with equal class intervals.

3. Histograms and frequency curves.

Where the variation is discrete we know the frequencies with 
which the values occur and can represent these frequencies by a 
series of points or by a series of ordinates as shown in Fig. 7, p. 256
of Mathematics for Actuarial Students, Part II. When the variable
is continuous, however, the data are grouped and we know only
the frequency with which values between $x_1$ and $x_2$, say, occur.
An ordinate at the mid-point between $x_1$ and $x_2$ is not a very satisfactory
way of representing the frequency, and the most usual way
is to erect a rectangle with base from $x_1$ to $x_2$ and area proportionate
to the frequency. If the class-interval is constant it is of course
immaterial whether we regard the heights or the areas of the 
rectangles as representing the frequencies, but the method can be 
used with advantage when this is not the case. 

The series of rectangles is called a histogram. 
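
As a minimal sketch of why area rather than height should carry the frequency when the class-intervals are unequal, the Python fragment below computes the rectangle heights that make each area equal to its group frequency. The function name `histogram_heights` and the figures used are hypothetical illustrations, not data from the text.

```python
def histogram_heights(boundaries, frequencies):
    """Heights of histogram rectangles whose areas equal the group frequencies.

    boundaries  -- the n + 1 group limits, in ascending order
    frequencies -- the n group frequencies
    """
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than frequencies")
    heights = []
    for lower, upper, frequency in zip(boundaries, boundaries[1:], frequencies):
        width = upper - lower
        if width <= 0:
            raise ValueError("boundaries must be strictly increasing")
        heights.append(frequency / width)  # area = height x width = frequency
    return heights


# Hypothetical data: two groups of width 10 followed by one group of width 20.
boundaries = [0, 10, 20, 40]
frequencies = [50, 80, 60]
heights = histogram_heights(boundaries, frequencies)
```

With equal class-intervals every frequency is divided by the same width, so heights and areas tell the same story; here the last group has the second-largest frequency but the lowest rectangle, because its frequency is spread over twice the width.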

Suppose, for instance, that we are given the following frequency
distribution for the continuous variable ‘age’:

    Age last birthday    Number of deaths
         30-39                 7,408
         40-49                 9,482
         50-59                13,953
         60-69                20,865
         70-79                23,990
         80-89                12,714
         90-99                 1,269

* In such cases a common convention is to include the frequency for the
value $x_1$ but not for $x_2$ in the group $x_1$–$x_2$. Similarly the frequency for the
value $x_2$ is included in $x_2$–$x_3$. Other conventions are frequently adopted.


These results could be represented by a histogram thus: 



Ten years is a fairly wide class-interval to use and as a result
the “steps” of the histogram are rather large, particularly above
age 70, thus producing an irregular outline. If, instead, the class-interval
had been unity, the process of drawing seventy rectangles
would have been laborious; the outline would however have been
much smoother, particularly over the range 70-90.

The data for this range for unit intervals are given below: 


    Age    Deaths        Age    Deaths
     70     2434          80     2018
     71     2468          81     1873
     72     2490          82     1712
     73     2496          83     1540
     74     2487          84     1361
     75     2459          85     1180
     76     2412          86     1002
     77     2343          87      830
     78     2255          88      671
     79     2146          89      527
These are actually the values of $d_x$ in the Table (Makeham
Graduation), and a histogram based on these values would involve
groups with unit class-interval, each group containing the deaths
between one integral age and the next.
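
The relation between the two tabulations can be checked directly. The sketch below, with a hypothetical helper `regroup`, aggregates the unit-interval deaths for ages 70 to 89 into ten-year groups; the totals should agree with the entries for 70-79 and 80-89 in the earlier table.

```python
# Deaths at each age last birthday, 70 to 89, as tabulated in the text.
deaths_by_age = {
    70: 2434, 71: 2468, 72: 2490, 73: 2496, 74: 2487,
    75: 2459, 76: 2412, 77: 2343, 78: 2255, 79: 2146,
    80: 2018, 81: 1873, 82: 1712, 83: 1540, 84: 1361,
    85: 1180, 86: 1002, 87: 830, 88: 671, 89: 527,
}


def regroup(frequency_by_value, width):
    """Aggregate unit-interval frequencies into groups of the given width.

    Each group is keyed by its lowest value, e.g. 70 for the group 70-79.
    """
    grouped = {}
    for value, frequency in sorted(frequency_by_value.items()):
        lower = (value // width) * width
        grouped[lower] = grouped.get(lower, 0) + frequency
    return grouped


decennial = regroup(deaths_by_age, 10)
```

Widening the class-interval loses detail but never changes the total frequency, which is why the coarse and fine histograms enclose the same area over any range made up of whole groups.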

We can, however, go further and draw a continuous curve representing
the limiting form of the histogram when the class-interval
is indefinitely reduced. Such a curve (known as a frequency
curve) can be used to find the frequency with which values between
any limits ($x_1$ and $x_2$, say) will occur. This is represented
by the area bounded by the curve, the $x$-axis and the ordinates at
$x_1$ and $x_2$. If the ordinate of the curve is represented by $f(x)$ it is
not correct to say that $f(x)$ represents the frequency with which the
value $x$ will occur. All we can say is that values between $x$ and
$x + \Delta x$ will occur approximately with frequency $f(x)\,\Delta x$ if $\Delta x$ is
very small.

It is important to realize that when the variable is continuous 
there is no such thing as the frequency with which a certain 
value X will occur exactly. Some range of values, however small, 
is always understood, although not always referred to explicitly. 
In a darts match we could record the frequency with which 17 
was scored, but if we tried to record the frequency with which 
1000 hens laid eggs weighing 2| oz. we should have to decide what 
range of values was to be included. We might decide to include 
all weights from 2^ oz. to zf oz. or from one-thousandth of an 
ounce below to one-thousandth of an ounce above that weight. However
small the interval was made it would still exist and would be 
dependent on the degree of accuracy with which the weighing could 
be carried out. 

In the above example $f(x)\,\Delta x$ gives the number of deaths occurring
between ages $x$ and $x + \Delta x$; $f(x)$ is therefore not a function
of $d_x$ but is of the form $l_x\mu_x$.

In practice, of course, we are usually able to obtain only the 
grouped data and may have to estimate as best we can the shape of 
the frequency curve which would be derived if the class-interval 
were indefinitely reduced and the numbers of groups accordingly 
increased. This problem will be dealt with later in this book, but it 
may be said at once that the method adopted is to start with a 
mathematical curve, the equation of which involves one or more
constants, and to determine what values of these constants give the 
best “fit” to the given data. It should not be assumed, however, 
that a frequency curve to fit any frequency distribution can be 
arrived at by a priori reasoning from the data; a fairly good fit can 
usually be obtained, however, by empirical methods. 

In dealing with grouped data it is usually sufficient to assume 
that the total frequency for any group is concentrated at the mid-point
of that group, but in the following sections we shall deal with
several instances for which such an assumption may not be sufficiently
accurate.

4. Moments of a frequency distribution. 

If we have a discontinuous or discrete variable which takes the
values $x_1, x_2, \ldots, x_n$ with frequencies $f_1, f_2, \ldots, f_n$ (total frequency
$N$), the $r$th moment about the origin is defined as

$$m_r = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^r. \tag{1}$$

If the origin is the mean of the distribution, we shall speak of
“moments about the mean” and denote them by $\mu_r$ instead of $m_r$.

It is a simple matter to convert moments about one origin to
moments about the mean ($x = M$).

If we denote by $\xi_i$ the distance $x_i - M$, we have, using the above
notation,

$r$th moment about the origin

$$m_r = \frac{1}{N}\sum f_i(\xi_i + M)^r = \frac{1}{N}\sum f_i\left\{\xi_i^r + rM\xi_i^{r-1} + \binom{r}{2}M^2\xi_i^{r-2} + \ldots + M^r\right\}. \tag{2}$$

But $\mu_r$, the $r$th moment about the mean, is $\dfrac{1}{N}\sum f_i\xi_i^r$.

Hence the above equation can be written

$$m_r = \mu_r + rM\mu_{r-1} + \binom{r}{2}M^2\mu_{r-2} + \ldots + M^r. \tag{3}$$

Similarly it can be shown that

$$\mu_r = m_r - rMm_{r-1} + \binom{r}{2}M^2m_{r-2} - \ldots + (-1)^rM^r. \tag{4}$$

But it will be remembered that $M$, the mean, is defined by

$$M = \frac{1}{N}\{x_1f_1 + x_2f_2 + \ldots + x_nf_n\} = m_1 \text{ above},$$

and the standard deviation, $\sigma$, is defined by the equation

$$\sigma^2 = \frac{1}{N}\{\xi_1^2f_1 + \xi_2^2f_2 + \ldots + \xi_n^2f_n\} = \mu_2.$$

Hence, putting $r = 1$ in equation (3), we have

$$m_1 = \mu_1 + M; \text{ giving } \mu_1 = 0,$$

i.e. the first moment about the mean is zero.

Similarly, putting $r = 2$, we have

$$m_2 = \mu_2 + 2M\mu_1 + M^2 = \mu_2 + M^2, \text{ since } \mu_1 = 0;$$

$$\text{or}\quad \sigma^2 = \mu_2 = m_2 - M^2. \tag{5}$$

Therefore, to find $\sigma^2$, we take the second moment about any convenient
origin and subtract the square of the distance between this
origin and the mean. This is of course the method with which the
student is already familiar.

Again, putting $r = 3$, we have from equation (4)

$$\mu_3 = m_3 - 3Mm_2 + 3M^2m_1 - M^3 = m_3 - 3m_1m_2 + 2m_1^3;$$

$$\text{and}\quad \mu_4 = m_4 - 4m_1m_3 + 6m_1^2m_2 - 3m_1^4.$$

In the same way we can find $m$’s in terms of $\mu$’s from equation (3).
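
The conversion from moments about an arbitrary origin to moments about the mean can be sketched in a few lines of Python. The function names below are hypothetical; the data are the golf scores of Example 1 later in the chapter, with the working origin 74 used there. The conversion applies the binomial expansion of equation (4), taking the zeroth moment as unity.

```python
from math import comb, sqrt


def moments_about_origin(values, freqs, origin, max_order=4):
    """Return [m_0, m_1, ..., m_max_order] measured from the given origin."""
    n = sum(freqs)
    return [
        sum(f * (x - origin) ** r for x, f in zip(values, freqs)) / n
        for r in range(max_order + 1)
    ]


def moments_about_mean(m):
    """Convert moments about an origin into moments about the mean.

    Applies mu_r = sum_j (-1)^j C(r, j) M^j m_(r-j), where M = m_1 is the
    distance of the mean from the origin and m_0 = 1.
    """
    M = m[1]
    return [
        sum((-1) ** j * comb(r, j) * M ** j * m[r - j] for j in range(r + 1))
        for r in range(len(m))
    ]


scores = list(range(70, 82))
freqs = [16, 93, 181, 196, 163, 120, 83, 56, 38, 26, 17, 11]

m = moments_about_origin(scores, freqs, origin=74)
mu = moments_about_mean(m)
sigma = sqrt(mu[2])  # standard deviation, before any grouping adjustment
```

Here `m[1]` is the distance .014 of the mean from the working origin and `mu[2]` reproduces 5.378 - (.014)^2; the first moment about the mean, `mu[1]`, comes out as zero apart from rounding error.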

Note. The use of $\mu$ to denote moments about the mean is very common
in statistical literature, and although in theoretical work confusion may
arise with the force of mortality (and similarly $m$ may be confused with
the central rate of mortality) in numerical work this is unlikely to arise, as
moments of higher order than the fourth are hardly ever met with.

For continuous variables we shall denote by $f(x)$ the ordinate to
the frequency curve at the point $x$.
As stated above we can assume the frequency between $x$ and
$x + \Delta x$ to be approximately $f(x)\,\Delta x$.

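Hence in the limit

$$m_r = \frac{1}{N}\int x^r f(x)\,dx, \tag{6}$$

where $N$ = total frequency

$$= \int f(x)\,dx,$$

both integrals being taken over the whole range of the variable.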

5. Sheppard’s adjustments.

If we assume the total frequency of any group to be concentrated 
at the mid-point of that group, errors arise in the calculation of 
some of the moments. A method of making adjustments in certain 
circumstances is due to Dr W. F. Sheppard. 

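Let us consider a group for which the variable lies between
$x_i - \tfrac{1}{2}h$ and $x_i + \tfrac{1}{2}h$ (class-interval $h$).

Let the frequency curve stretch from $x = a$ to $x = b$ and, as before,
let the ordinate at the point $x$ be $f(x)$.

The true $r$th moment $m_r$ is then

$$m_r = \frac{1}{N}\int_a^b x^r f(x)\,dx \qquad (N = \text{total frequency}).$$

The approximate $r$th moment obtained by assuming the total
frequency for each group to be concentrated at the mid-point is
given by an expression of the form

$$m'_r = \frac{1}{N}\sum_i x_i^r \int_{-h/2}^{h/2} f(x_i + z)\,dz, \tag{7}$$

where the summation embraces all groups.

By Taylor’s theorem,

$$f(x_i + z) = \left\{1 + zD + \frac{z^2D^2}{2!} + \frac{z^3D^3}{3!} + \ldots\right\} f(x_i),$$

where $D \equiv \dfrac{d}{dx}$,

$$\therefore\quad \int_{-h/2}^{h/2} f(x_i + z)\,dz = hf(x_i) + \frac{h^3}{24}f''(x_i) + \frac{h^5}{1920}f^{iv}(x_i) + \ldots. \tag{8}$$

Hence, from (7),

$$Nm'_r = h\sum x_i^r f(x_i) + \frac{h^3}{24}\sum x_i^r f''(x_i) + \frac{h^5}{1920}\sum x_i^r f^{iv}(x_i) + \ldots. \tag{9}$$

It will be remembered that the Euler-Maclaurin expansion
expresses $\int_a^b f(x)\,dx$ in terms of $\sum f(x)$ at intervals of $h$ and terms
involving the values of $f(x)$ and its derivatives at the limits $a$ and $b$.
If the frequency curve has contact of a high order with the $x$-axis
at both ends, i.e. if $f(x)$ and its first few derivatives are zero at the
limits $x = a$ and $x = b$, then

$$h\sum x_i^r f(x_i) = \int_a^b x^r f(x)\,dx.$$

Treating the other sums in (9) in the same way, we have

$$Nm'_r = \int_a^b x^r f(x)\,dx + \frac{h^2}{24}\int_a^b x^r f''(x)\,dx + \frac{h^4}{1920}\int_a^b x^r f^{iv}(x)\,dx + \ldots.$$

The second term can be integrated by parts as follows:

$$\int_a^b x^r f''(x)\,dx = \Big[x^r f'(x)\Big]_a^b - r\Big[x^{r-1} f(x)\Big]_a^b + r(r-1)\int_a^b x^{r-2} f(x)\,dx.$$

Since $f(x)$ and its derivatives vanish at both limits this reduces
to the last term.

Similarly, the term $\int_a^b x^r f^{iv}(x)\,dx$ can be transformed into

$$r(r-1)(r-2)(r-3)\int_a^b x^{r-4} f(x)\,dx.$$

In this way the expression for $m'_r$ can be transformed to

$$m'_r = m_r + \frac{h^2\,r(r-1)}{24}\,m_{r-2} + \frac{h^4\,r(r-1)(r-2)(r-3)}{1920}\,m_{r-4} + \ldots,$$

where the $m$’s are true moments.

Hence

$$\begin{aligned}
m'_1 &= m_1,\\
m'_2 &= m_2 + \frac{h^2}{12},\\
m'_3 &= m_3 + \frac{h^2}{4}\,m_1,\\
m'_4 &= m_4 + \frac{h^2}{2}\,m_2 + \frac{h^4}{80}.
\end{aligned}$$

From these we have successively that

$$\left.\begin{aligned}
m_2 &= m'_2 - \frac{h^2}{12},\\
m_3 &= m'_3 - \frac{h^2}{4}\,m'_1 \quad (\text{since } m_1 = m'_1),\\
m_4 &= m'_4 - \frac{h^2}{2}\,m'_2 + \frac{7h^4}{240}.
\end{aligned}\right\} \tag{10}$$

For moments about the mean, $\mu_1$ and $\mu_3$ need no correction
($\mu_1 = \mu'_1 = 0$), so that:

$$\left.\begin{aligned}
\mu_2 &= \mu'_2 - \frac{h^2}{12},\\
\mu_4 &= \mu'_4 - \frac{h^2}{2}\,\mu'_2 + \frac{7h^4}{240},
\end{aligned}\right\} \tag{11}$$

where $\mu'$ represents a moment about the mean calculated on the
assumption that the total frequency of each group is concentrated
at the mid-point of that group.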

The assumption made above that $f(x)$ and its derivatives vanish
at $a$ and $b$ means that in practice Sheppard’s adjustments should not
be made unless the frequency dwindles to nothing at each end of
the range considered and has negligible first and second differences
as these ends are approached. (It should be borne in mind that we
can usually only estimate the shape of the frequency curve from the
given data.)

The most common adjustment is the deduction of $h^2/12$ in finding
$\mu_2$ ($= \sigma^2$) and this will be made in the example to be found later in
the chapter. It is doubtful, however, whether the correction is
worth making unless the data are fairly extensive (say, a total frequency
of at least 1000), because errors of sampling would otherwise
be large compared with errors of grouping.
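
A minimal Python sketch of the adjustments to the second and fourth moments about the mean follows. The function name `sheppard_adjust` is hypothetical. The grouped second moment is the figure 5.378 - (.014)^2 from the illustrative example later in the chapter, with unit class-interval; the grouped fourth moment of 100 is an invented figure, included only to exercise the formula.

```python
from math import sqrt


def sheppard_adjust(mu2_grouped, mu4_grouped, h):
    """Apply Sheppard's adjustments to grouped moments about the mean.

    mu2_grouped, mu4_grouped -- moments calculated with each group's
                                frequency concentrated at its mid-point
    h                        -- the (constant) class-interval
    """
    mu2 = mu2_grouped - h ** 2 / 12
    mu4 = mu4_grouped - (h ** 2 / 2) * mu2_grouped + 7 * h ** 4 / 240
    return mu2, mu4


mu2_grouped = 5.378 - 0.014 ** 2  # from the illustrative example
mu4_grouped = 100.0               # hypothetical figure, for illustration only
mu2, mu4 = sheppard_adjust(mu2_grouped, mu4_grouped, h=1)
sigma = sqrt(mu2)
```

The adjustment to the fourth moment uses the unadjusted second moment, as in the formula. As the text warns, neither adjustment is appropriate unless the frequency tails off smoothly to zero at both ends of the range.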

6. Mode of grouped data. 

Theoretically the only sound way of finding the mode of grouped 
data when the variable is continuous is to fit a frequency curve to 
the data and then find when the gradient $dy/dx$ vanishes. Usually,
however, the mode is not required to a degree of accuracy which 
would justify the considerable labour involved. 

The work can be done graphically by drawing a histogram and 
replacing it by a smooth curve from which the mode can be found 
by inspection. Here again the work is fairly heavy, but the method 
has the great advantage that all the data are used. 

(It should be remembered, however, that there may be more than 
one mode, each of which corresponds to a local maximum ordinate 
on the frequency curve.) 

Of the short analytical methods perhaps the best is to fit a curve 
of the form $f(x) = a + bx + cx^2 + \ldots$ to the data in the immediate
neighbourhood of the mode. This is usually fairly simple, but 
unfortunately only some of the data are used (see Example, para. 8). 
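
One way of carrying out such a fit, offered only as a sketch under stated assumptions, is to pass a parabola through three consecutive groups of equal width: the group with the greatest frequency and its two neighbours. The function below treats the three frequencies as ordinates at the group mid-points and returns the abscissa at which the parabola is highest. For a second-degree curve the turning point is the same whether the frequencies are treated as ordinates or as areas over the groups. The name `parabolic_mode` is hypothetical, and the figures are those for nearest ages 72, 73 and 74 in Example 2.

```python
def parabolic_mode(mid, h, f_prev, f_modal, f_next):
    """Turning point of the parabola through three equally spaced groups.

    mid     -- mid-point of the group with the greatest frequency
    h       -- class-interval (assumed equal for the three groups)
    f_prev  -- frequency of the group below the modal group
    f_modal -- frequency of the modal group
    f_next  -- frequency of the group above the modal group
    """
    second_difference = f_prev - 2 * f_modal + f_next
    if second_difference >= 0:
        raise ValueError("the middle group must have the greatest frequency")
    # Vertex of the parabola through (mid - h, f_prev), (mid, f_modal)
    # and (mid + h, f_next).
    return mid + h * (f_prev - f_next) / (2 * second_difference)


mode = parabolic_mode(73, 1, 181, 196, 163)
```

The result lies a little below 73 because the group below the modal group is the larger of the two neighbours. Only three of the twelve group frequencies enter the calculation, which is the weakness the text points out.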

7. Mean deviation of grouped data. 

This section relates to the mean deviation which is the average of 
the absolute magnitudes of deviations from the arithmetic mean. 

The group in which the mean lies presents difficulty when the 
mean deviation has to be calculated and this group is usually left 
until the last. For convenience we shall call it the special group. 
The work is divided into three stages:

(1) Ignoring the special group the total in each group is assumed 
to be concentrated at the mid-point of the group and all distances 
are measured from the origin, which is, of course, chosen so as to 
reduce the arithmetical work. 

(2) The result is adjusted so as to allow for all distances being 
measured from the mean (still ignoring the special group), 

(3) The mean deviation of the special group about the mean is 
found and an appropriate adjustment made to the result of (2). 

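Denote the group frequencies by

$$u_{-r},\ u_{-r+1},\ \ldots,\ u_0,\ u_1,\ \ldots,\ u_{s-1},\ u_s$$

and the distances of the mid-points from the origin by

$$x_{-r},\ \ldots,\ x_{-1},\ x_0,\ x_1,\ x_2,\ \ldots,\ x_{s-1},\ x_s.$$

The products $u_{-r}x_{-r},\ u_{-r+1}x_{-r+1},\ \ldots,\ u_{k-1}x_{k-1},\ u_{k+1}x_{k+1},\ \ldots$
(ignoring the special group term $u_kx_k$) are calculated and summed,
treating all signs as positive.

[Figure: a horizontal line marking, from left to right, the mid-point $P_{-i}$, the origin O, the mean G (at distance $M$ from O) and the mid-point $P_j$.]

In the illustration O represents the origin and G the position of
the mean, distant $M$ from the origin. $P_j$ represents the mid-point
of a group with frequency $u_j$ lying to the right of O, while $P_{-i}$
represents the mid-point of a group with frequency $u_{-i}$ lying to
the left of O.

$$OP_j = x_j \quad\text{and}\quad OP_{-i} = x_{-i}.$$

We have calculated $u_jx_j$, but to find the mean deviation about
the mean we require

$$u_j \cdot GP_j = u_j(x_j - M) = u_jx_j - Mu_j.$$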

The same applies to all groups to the right of O, so that in finding 
the mean deviation about the mean the total frequency for these 
groups must be multiplied by M and subtracted from the previous 
result. 



Similarly, the term $u_{-i}\,|x_{-i}|$ must be replaced by $u_{-i} \cdot P_{-i}G$, i.e. by
$u_{-i}\,|x_{-i}| + Mu_{-i}$. Dealing with all the groups to the left of O in the same way we
see that we have to adjust the previous result by adding $M \times$ (total
frequency in the groups to the left of O).

Summing up, we may say that the process referred to as (2) above 
can be reduced to: 

multiplying the total frequency in groups to the right of O by $M$
and subtracting the product from the result of the first process; and

multiplying the total frequency in groups to the left of O by $M$
and adding the product to the result of the first process.

The student should investigate the problem when G is to the 
left of O and is recommended, when in doubt, to make a drawing 
similar to the above to ensure that, in making the adjustment, the 
signs are correct. 








Finally, the special group has to be dealt with. 

As before, G represents the mean and AB the limits of the special 
group: 

AG=a and GB=b. 

If we assume that the group frequency $u_k$ is evenly spread over
the range, the frequency to the left of G will be $\dfrac{a}{a+b}\,u_k$ and the
frequency to the right will be $\dfrac{b}{a+b}\,u_k$.

We can now assume that the first of these frequencies is concentrated
at the mid-point of AG and the second at the mid-point
of GB.

The special group thus contributes a term

$$\frac{a}{a+b}\,u_k \cdot \frac{a}{2} + \frac{b}{a+b}\,u_k \cdot \frac{b}{2} = \frac{a^2+b^2}{2(a+b)}\,u_k.$$

This is added to the previous result, and on dividing by the total 
frequency we obtain the mean deviation about the mean. 




This seems a somewhat laborious process, but the numerical 
work is relatively simple, as will be seen in Example 2. 
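
The three stages can be sketched in Python as follows. The function name `grouped_mean_deviation` is hypothetical, and the sketch assumes equal class-intervals, mid-points given in ascending order, and a working origin lying inside the special group (so that every other group is on the same side of the origin as of the mean). The data are those of Example 2: ages nearest birthday 70 to 81 with working origin 74.

```python
def grouped_mean_deviation(midpoints, freqs, h, origin):
    """Mean deviation about the mean for grouped data, in three stages."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    M = mean - origin  # distance of the mean G from the origin O

    # Locate the special group, i.e. the group containing the mean.
    k = next(i for i, x in enumerate(midpoints)
             if x - h / 2 <= mean < x + h / 2)
    lower, upper = midpoints[k] - h / 2, midpoints[k] + h / 2
    if not lower <= origin <= upper:
        raise ValueError("this sketch assumes the origin lies in the special group")

    # Stage 1: deviations from the origin, irrespective of sign,
    # ignoring the special group.
    stage1 = sum(f * abs(x - origin)
                 for i, (x, f) in enumerate(zip(midpoints, freqs)) if i != k)

    # Stage 2: measure from the mean instead; add M x (frequency to the left)
    # and subtract M x (frequency to the right).
    left, right = sum(freqs[:k]), sum(freqs[k + 1:])
    stage2 = stage1 + M * (left - right)

    # Stage 3: special group, frequency evenly spread between A and B.
    a, b = mean - lower, upper - mean
    special = freqs[k] * (a * a + b * b) / (2 * (a + b))

    return (stage2 + special) / n


ages = list(range(70, 82))
freqs = [16, 93, 181, 196, 163, 120, 83, 56, 38, 26, 17, 11]
mean_deviation = grouped_mean_deviation(ages, freqs, h=1, origin=74)
```

Rounded to two places the result agrees with the 1.86 years found in Example 2.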

8. Illustrative examples. 

The following example will serve to remind the student of what 
he has already learnt in Mathematics for Actuarial Students. It
does not involve a continuous variable or grouped data. 

Example 1. 

The table below gives a frequency distribution of the scores returned 
in a veterans’ golf competition for which there were 1000 competitors:


    Score    Frequency        Score    Frequency
      70         16             76         83
      71         93             77         56
      72        181             78         38
      73        196             79         26
      74        163             80         17
      75        120             81         11


Determine the values of the mean, median, mode, quartile deviation, 
mean deviation and standard deviation. 

To save arithmetic, measure deviations from the score 74 instead of
from 0 or 70.


    Value of    Frequency   Deviation    (x-74) x f    (x-74)^2 x f   Cumulative
    variable                 from 74                                   frequency
       x           f          x-74       (2) x (3)      (3) x (4)        Σf
      (1)         (2)          (3)          (4)            (5)           (6)

       70          16          -4           -64            256            16
       71          93          -3          -279            837           109
       72         181          -2          -362            724           290
       73         196          -1          -196            196           486
       74         163           0             0              0           649
       75         120           1           120            120           769
       76          83           2           166            332           852
       77          56           3           168            504           908
       78          38           4           152            608           946
       79          26           5           130            650           972
       80          17           6           102            612           989
       81          11           7            77            539          1000

     Total       1000                   -901 + 915        5378
                                          = +14
Mean $= 74 + \dfrac{14}{1000} = 74.014$.

Second moment about origin (74) $= \dfrac{5378}{1000} = 5.378$.

Distance of origin from mean = .014.

∴ (standard deviation)² = 5.378 - (.014)²,
so that σ = 2.32.

Mean deviation from the mean. Total deviation from the origin (irrespective
of sign) = 901 + 915 = 1816. We must, however, measure from
74.014 instead of from 74.

All frequencies for values of x greater than 74 must therefore be
multiplied by .014 and subtracted, while all frequencies for values of x
of 74 or less must be multiplied by .014 and added.

We have 1816 + .014(649 - 351) = 1820.

∴ mean deviation from the mean = 1.82.

Median. As there are 1000 observations the median will lie between the
500th and the 501st. The nearest integral value of x satisfying this condition
is 74, which may be taken as the median.

Lower quartile. A quarter of the total frequency is 250, but it cannot
be said that one-quarter of the total frequency lies below the 250th
observation (nor for that matter does three-quarters lie above the 251st
observation). The lower quartile lies between the 250th and 251st
observation and the nearest integral value of x satisfying this condition
is 72.

Upper quartile. Similarly, the upper quartile separates the 750th and
751st observation and may be taken as 75.

Quartile deviation is therefore $\dfrac{75 - 72}{2} = 1.5$.

Mode. The value of x having the greatest frequency is 73.
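
The figures of Example 1 can be checked with a short Python sketch. The helper `value_at_rank` is a hypothetical name; it returns the score of the observation holding a given rank when the scores are arranged in ascending order. The median and quartiles are taken, as in the text, as lying between two consecutive ranked observations.

```python
from math import sqrt

scores = list(range(70, 82))
freqs = [16, 93, 181, 196, 163, 120, 83, 56, 38, 26, 17, 11]


def value_at_rank(values, freqs, rank):
    """Value of the observation with the given rank (1 = lowest)."""
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if rank <= cumulative:
            return x
    raise ValueError("rank exceeds the total frequency")


def between(values, freqs, rank):
    """Value midway between the observations ranked `rank` and `rank + 1`."""
    return (value_at_rank(values, freqs, rank)
            + value_at_rank(values, freqs, rank + 1)) / 2


n = sum(freqs)
mean = sum(f * x for x, f in zip(scores, freqs)) / n
sd = sqrt(sum(f * (x - mean) ** 2 for x, f in zip(scores, freqs)) / n)
mean_deviation = sum(f * abs(x - mean) for x, f in zip(scores, freqs)) / n

median = between(scores, freqs, n // 2)                 # 500th and 501st
lower_quartile = between(scores, freqs, n // 4)         # 250th and 251st
upper_quartile = between(scores, freqs, 3 * n // 4)     # 750th and 751st
quartile_deviation = (upper_quartile - lower_quartile) / 2
mode = max(zip(freqs, scores))[1]  # score with the greatest frequency
```

Here the mean deviation is computed directly from the mean rather than by the origin-and-adjustment route of the text; the two routes give the same result.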

Example 2. 

If in the above example the frequency distribution relates, not to the 
scores but to the ages nearest birthday of the competitors, what alterations 
will there be in the values of the respective indices?

Here we are dealing with a continuous variable and the data are 
grouped. Thus the frequency 181 shown opposite x = 72 really relates
to ages from 71½ to 72½ (class-interval unity).

Mean. Assuming the frequencies to be concentrated at the mid-points 
of the groups we find that the previous calculations still hold good, and 
the mean is 74.014 as before (no Sheppard’s adjustment is required to the
first moment). 
Standard deviation. Making the Sheppard’s adjustment $\left(-\dfrac{1}{12}\right)$, we have

(standard deviation)² = 5.378 - (.014)² - 1/12,

∴ σ = 2.30 years.

Mean deviation from the mean. Ignoring the special group (73½-74½, or
nearest age 74) we have the sum of the deviations from 74, irrespective
of sign, 1816, as before.

In order to obtain the deviations from the mean (74.014) all frequencies
in the groups for ages greater than 74½ must be multiplied by .014 and
subtracted from the result. Similarly, all frequencies for ages less than
73½ must be multiplied by .014 and added.

This gives 1816 + .014(486 − 351).

Finally, we must deal with the group 73½-74½, with frequency 163.

[Diagram: the interval from 73.5 to 74.5 divided at the mean, 74.014, into lengths .514 and .486.]

Of this frequency .514 × 163 are assumed to lie between 73.5 and the
mean and .486 × 163 between the mean and 74.5.

Each of these sub-groups may be assumed to be concentrated at the
mid-point of its own range, thus giving a term

$$163\left[.514 \times \tfrac{.514}{2} + .486 \times \tfrac{.486}{2}\right]$$

to be added to the above.

Note. For the special group it is simpler to measure deviations from the
mean direct, rather than first from 74 with a correction applied afterwards.

The mean deviation therefore becomes

$$\tfrac{1}{1000}\left[1816 + .014(486 - 351) + \tfrac{163}{2}\left(.514^2 + .486^2\right)\right] = 1.86 \text{ years}.$$
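
The three parts of this calculation can be reproduced as follows. This is a sketch only, built from the figures quoted in the text; the variable names are our own.

```python
# Figures quoted in Example 2; names are illustrative.
mean = 74.014
total_frequency = 1000

# Groups other than 73.5-74.5: deviations from 74, corrected to the mean.
outside_special_group = 1816 + 0.014 * (486 - 351)

# Special group 73.5-74.5 (frequency 163), split at the mean; each part is
# treated as concentrated at the mid-point of its own range.
lower_part = mean - 73.5   # .514
upper_part = 74.5 - mean   # .486
special_group = 163 * (lower_part * lower_part / 2 + upper_part * upper_part / 2)

mean_deviation = (outside_special_group + special_group) / total_frequency
print(round(mean_deviation, 2))
```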

Median. As before, this is a value of x lying between the 500th and 501st
observations. These lie in the group 73½-74½, which includes 163
observations, while the total frequency for lower values of x is 486.

We assume that the 163 observations between 73½ and 74½ are evenly
spread over the interval at distances $\tfrac{1}{163}$ apart, with $\tfrac{1}{326}$ at either
end, i.e. they occur for values of x:

$$73\tfrac12 + \tfrac{1}{326},\quad 73\tfrac12 + \tfrac{3}{326},\quad \ldots,\quad 73\tfrac12 + \tfrac{325}{326}.$$

The first of these observations is the 487th counting from the lowest
values of x, the next is the 488th, and so on.

The 500th and 501st observations are the 14th and 15th in that particular
group and correspond therefore to values of x of $73\tfrac12 + \tfrac{27}{326}$ and $73\tfrac12 + \tfrac{29}{326}$.

The median may be taken therefore as $73\tfrac12 + \tfrac{14}{163} = 73.59$ years.

Lower quartile. 109 values lie below the group 71½-72½ which itself
includes 181 observations. These 181 observations are assumed to
correspond to values of x of

$$71\tfrac12 + \tfrac{1}{362},\quad 71\tfrac12 + \tfrac{3}{362},\quad \ldots,\quad 71\tfrac12 + \tfrac{361}{362},$$

so that the 250th and 251st observations (141st and 142nd in that group)
correspond to values of x:

$$71\tfrac12 + \tfrac{281}{362} \quad\text{and}\quad 71\tfrac12 + \tfrac{283}{362}.$$

The lower quartile may therefore be taken to be $71\tfrac12 + \tfrac{141}{181} = 72.28$ years.

Upper quartile. The 750th and 751st observations are the 101st and
102nd in the group 74½-75½.

Hence, dividing 101 by the frequency of that group and adding the
result to 74½, the upper quartile is 75.34 years.

Quartile deviation. $\tfrac12(75.34 - 72.28) = 1.53$ years.
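
The interpolation used for the median and the lower quartile can be sketched as a small function. The function name and its arguments are illustrative assumptions; the rule it encodes is the one used above, namely that the k-th observation in a group of frequency f lies at a fraction (2k − 1)/2f of the way through the group, so that the point midway between the k-th and (k + 1)-th lies at k/f.

```python
from fractions import Fraction

def grouped_quantile(lower_bound, width, freq_below, freq_in_group, rank):
    """Value midway between the rank-th and (rank+1)-th observations,
    assuming the observations are evenly spread within their group."""
    k = rank - freq_below          # position of the rank-th observation in its group
    return lower_bound + Fraction(k * width, freq_in_group)

# Figures quoted in Example 2.
median = grouped_quantile(Fraction(147, 2), 1, 486, 163, 500)          # group 73.5-74.5
lower_quartile = grouped_quantile(Fraction(143, 2), 1, 109, 181, 250)  # group 71.5-72.5
upper_quartile = 75.34                                                 # value quoted in the text
quartile_deviation = (upper_quartile - float(lower_quartile)) / 2

print(float(median), float(lower_quartile), quartile_deviation)
```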

Mode, The student is recommended to try the graphic process of 
drawing the histogram, sketching in a smooth frequency curve and 
finding the mode by inspection. 

An anal3rtical method similar to the following is sometimes useful: 
Assume that the frequency curve in the neighbourhood of the mode 
is of the loxm y^a^-bx-^-cx^. 

Taking the origin at 73 we have the following equations from which 
to find ay b and cx 

f {a-\rbx-\- cx^) = 181 (the frequency in the group 7ii“72J); 

V li 

i.e. a-b+—^iii. 

12 


Similarly 

and 



From these we obtain 6 = - 9, c = — 24. 

The mode (the value of x for the maximum ordinate) is given by 



i.e. b-hzcx^^o. 


. ® 9 ^ . 

.. «= referred to 73 as origin. 

The mode is therefore 73 -^=72*8i years. 
Note: all the values calculated are values of x. 
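
The three equations can be solved mechanically. The sketch below assumes unit class-intervals and takes the frequency of the middle group (72½-73½) as 196, a figure inferred from the totals quoted earlier (486 − 109 − 181) rather than read directly from the table; the function name is our own.

```python
from fractions import Fraction

def parabola_mode(origin, f_left, f_mid, f_right):
    """Fit y = a + b*x + c*x**2 so that its integrals over three adjacent
    unit intervals centred at -1, 0, +1 equal the three group frequencies,
    then return (a, b, c, mode)."""
    b = Fraction(f_right - f_left, 2)
    c = Fraction(f_left + f_right - 2 * f_mid, 2)
    a = f_mid - c / 12
    return a, b, c, origin - b / (2 * c)

# 196 is an inferred frequency (486 - 109 - 181), not quoted in the text.
a, b, c, mode = parabola_mode(73, 181, 196, 163)
print(b, c, float(mode))
```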
9. Skewness.

If we imagine a frequency distribution represented by a smooth 
frequency curve, the measures already demonstrated will tell us a 
great deal about the curve. We know, for instance, where its highest 
point occurs (mode), whether it is like a steep-sided peak (small 
standard deviation) or a broad plateau (large standard deviation). 
There is a further characteristic in which we are sometimes
interested, namely, its lack of symmetry or “skewness”.

[Figures: two frequency curves, the first showing positive skewness and the second negative skewness.]

In a symmetrical curve the mean, median and mode all coincide,
and the extent to which they fail to do so gives rise to one well-known
measure of skewness, viz.

$$\frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} \ \text{(approx.)}. \qquad (12)$$

For reasons which will be discussed later, the approximate
measure should be used only if the skewness is relatively small.

A second measure sometimes used is

$$\frac{(\text{upper quartile} - \text{median}) - (\text{median} - \text{lower quartile})}{\tfrac12(\text{upper quartile} - \text{lower quartile})}. \qquad (13)$$

This is a very cumbersome measure and difficult to calculate.
Perhaps the most convenient measure is

$$\frac{\mu_3}{\sigma^3}, \qquad (14)$$

where $\mu_3$ is the third moment about the mean. This expression
has the great advantage that it is susceptible of arithmetical or 
algebraical calculation. It will be observed that when the hump 
of a curve occurs at low values of x the skewness is positive, while 
a hump to the right gives negative skewness. (A symmetrical curve 
gives of course a zero result.) 
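
The sign convention for the third-moment measure can be illustrated with a short sketch. The frequency distributions used are hypothetical and are chosen only to show a hump at low values, a hump at high values and a symmetrical case; the function name is our own.

```python
def moment_skewness(values, freqs):
    """Third moment about the mean divided by the cube of the standard deviation."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    mu2 = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    mu3 = sum(f * (x - mean) ** 3 for x, f in zip(values, freqs)) / n
    return mu3 / mu2 ** 1.5

values = [0, 1, 2, 3, 4]
hump_left = moment_skewness(values, [5, 4, 3, 2, 1])    # hypothetical data
hump_right = moment_skewness(values, [1, 2, 3, 4, 5])   # hypothetical data
symmetric = moment_skewness(values, [1, 2, 3, 2, 1])    # hypothetical data
print(hump_left, hump_right, symmetric)
```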


10. King’s formula.

Before we leave grouped data we must mention two important
formulae which will be needed later in the book. They are commonly
known as “King’s Formula” and “Hardy’s Formula”.

Let $u_x$ be a function whose fourth and higher differences are
negligible.

Let
$$w_{-1} = u_{-3m-1} + u_{-3m} + \ldots + u_{-m-1},$$
$$w_0 = u_{-m} + u_{-m+1} + \ldots + u_m,$$
$$w_1 = u_{m+1} + u_{m+2} + \ldots + u_{3m+1}.$$

$w_{-1}$, $w_0$ and $w_1$ are therefore three consecutive groups each of
2m + 1 values.

We wish to express $u_0$, the middle term of the middle group, in
terms of $w_{-1}$, $w_0$ and $w_1$.

Stirling’s formula is
$$u_x = u_0 + x\,\frac{\Delta u_0 + \Delta u_{-1}}{2} + \frac{x^2}{2!}\,\Delta^2 u_{-1} + \ldots$$
(Mathematics for Actuarial Students, Part II, p. 64.)

Summing from −m to +m:
$$w_0 = (2m+1)u_0 + \Delta^2 u_{-1}(1^2 + 2^2 + \ldots + m^2) = (2m+1)u_0 + \frac{m(m+1)(2m+1)}{6}\,\Delta^2 u_{-1}. \qquad (15)$$

Summing from −3m − 1 to 3m + 1:
$$w_{-1} + w_0 + w_1 = 3(2m+1)u_0 + \frac{(3m+1)(3m+2)(2m+1)}{2}\,\Delta^2 u_{-1}. \qquad (16)$$

Hence $$w_{-1} - 2w_0 + w_1 = (2m+1)^3\,\Delta^2 u_{-1}.$$

Substituting for $\Delta^2 u_{-1}$ in (15) we obtain:
$$w_0 = (2m+1)u_0 + \frac{m(m+1)}{6(2m+1)^2}\,(w_{-1} - 2w_0 + w_1)$$
or
$$u_0 = \frac{1}{2m+1}\left[w_0 - \frac{m(m+1)}{6(2m+1)^2}\,(w_{-1} - 2w_0 + w_1)\right]. \qquad (17)$$

Putting 2m + 1 = n, this may be written in the more usual form
$$u_0 = \frac{1}{n}\left[w_0 - \frac{n^2-1}{24n^2}\,\Delta^2 w_{-1}\right], \qquad (18)$$
where Δ now denotes the differencing of group totals and not
individual u’s.

When n = 5, we have
$$u_0 = .2w_0 - .008\,\Delta^2 w_{-1}, \qquad (19)$$
the usual form of King’s formula. The more general expression (18)
should, however, be remembered.
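
Since the derivation neglects only fourth and higher differences, formula (18) should reproduce the central term exactly when $u_x$ is a cubic. The sketch below checks this for n = 5; the cubic chosen is hypothetical and the function names are our own.

```python
from fractions import Fraction

def kings_formula(w_prev, w_mid, w_next, n):
    """Formula (18): central value from three consecutive group totals of n terms."""
    second_difference = w_prev - 2 * w_mid + w_next
    return (w_mid - Fraction(n * n - 1, 24 * n * n) * second_difference) / n

def u(x):
    # Hypothetical cubic; its fourth differences are exactly zero.
    return 7 + 3 * x - 2 * x**2 + x**3

m = 2
n = 2 * m + 1
w_prev = sum(u(x) for x in range(-3 * m - 1, -m))       # u(-7) ... u(-3)
w_mid = sum(u(x) for x in range(-m, m + 1))             # u(-2) ... u(2)
w_next = sum(u(x) for x in range(m + 1, 3 * m + 2))     # u(3) ... u(7)

recovered = kings_formula(w_prev, w_mid, w_next, n)
print(recovered, u(0))
```

With n = 5 the two coefficients are 1/5 = .2 and 24/(24 × 25 × 5) = .008, in agreement with (19).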

So far we have considered only odd values of n.

If n is even (= 2r say), let
$$w_0 = u_{-\frac{2r-1}{2}} + u_{-\frac{2r-3}{2}} + \ldots + u_{\frac{2r-3}{2}} + u_{\frac{2r-1}{2}}.$$

Proceeding as before, and remembering that 2r = n, we have
$$w_0 = nu_0 + \frac{n(n^2-1)}{24}\,\Delta^2 u_{-1},$$
corresponding to (15).

Formula (18) then follows as before, but it will be seen that
$$\frac{1}{n}\left[w_0 - \frac{n^2-1}{24n^2}\,\Delta^2 w_{-1}\right]$$
no longer gives the central term of the middle
group (there is no central term when n is even) but the value of u
for an argument half-way between those of the two central terms.
For instance, if the data are given for groups 40-43, 44-47 and
48-51 the application of the formula would give $u_{45\frac12}$.


11. Hardy’s formula.

This formula can be applied only to continuous functions. Unlike
King’s formula therefore it cannot be applied to functions
such as $E_x$ (the “initial” exposed to risk), which is discontinuous
at the end of every year of experience. On the other hand, $E^c_x$,
the exposed to risk in central form, is to all intents and purposes
continuous. It is true that it increases or decreases by whole
numbers (or, in certain formulae, by fractions), but entrants and
exits are allowed for as they occur and there is no sudden “jump”
as with $E_x$. Hardy’s formula can therefore be applied to $E^c_x$.

Let
$$w_{-1} = \int_{-3n/2}^{-n/2} f(x)\,dx, \quad\text{represented by the area } PQQ'P',$$
$$w_0 = \int_{-n/2}^{n/2} f(x)\,dx, \quad\text{represented by the area } QRR'Q',$$
$$w_1 = \int_{n/2}^{3n/2} f(x)\,dx, \quad\text{represented by the area } RSS'R'.$$

Hardy’s formula gives the central ordinate in terms of the
three areas, i.e. f(0) in terms of $w_{-1}$, $w_0$ and $w_1$, assuming fourth
and higher differences to be negligible.

Let $f(x) = a + bx + cx^2 + dx^3$.

Then $$w_0 = \int_{-n/2}^{n/2} f(x)\,dx = na + \frac{n^3}{12}\,c, \qquad (20)$$

$$w_{-1} + w_0 + w_1 = \int_{-3n/2}^{3n/2} f(x)\,dx = 3na + \frac{9n^3}{4}\,c$$

and $$\Delta^2 w_{-1} = w_{-1} - 2w_0 + w_1 = 2n^3 c, \qquad (21)$$

where Δ is the operator for differencing grouped values.

Hence, from (20), $$w_0 = na + \tfrac{1}{24}\,\Delta^2 w_{-1},$$

and a, the central ordinate, $$= \frac{1}{n}\left[w_0 - \tfrac{1}{24}\,\Delta^2 w_{-1}\right]. \qquad (22)$$

This is Hardy’s formula.
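
Formula (22) can be checked in the same spirit by integrating a cubic exactly over the three intervals. The coefficients below are hypothetical, and the helper names are our own.

```python
from fractions import Fraction

def integral_of_cubic(a, b, c, d, lo, hi):
    """Exact integral of a + b*x + c*x**2 + d*x**3 between lo and hi."""
    def antiderivative(x):
        return a * x + b * x**2 / 2 + c * x**3 / 3 + d * x**4 / 4
    return antiderivative(hi) - antiderivative(lo)

def hardys_formula(w_prev, w_mid, w_next, n):
    """Formula (22): central ordinate from three adjacent areas of width n."""
    return (w_mid - Fraction(1, 24) * (w_prev - 2 * w_mid + w_next)) / n

a, b, c, d = 40, -3, 2, 1      # hypothetical coefficients; f(0) = a
n = 5
half = Fraction(n, 2)
w_prev = integral_of_cubic(a, b, c, d, -3 * half, -half)
w_mid = integral_of_cubic(a, b, c, d, -half, half)
w_next = integral_of_cubic(a, b, c, d, half, 3 * half)

central_ordinate = hardys_formula(w_prev, w_mid, w_next, n)
print(central_ordinate)
```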

12. Application to exposed to risk and deaths.

It is important that the student should realize the essential
difference between King’s and Hardy’s formulae.
King’s formula is a finite difference formula enabling the central
term to be found when only three group totals are given.

If, for example, we are given $\sum E_x$ for groups 40-44, 45-49
and 50-54, i.e. $E_{40} + E_{41} + E_{42} + E_{43} + E_{44}$, etc., the formula gives
a value for the central term of the middle group. Similarly,
if the functions given were $\sum E^c_x$ or $\sum \theta_x$ (the number of deaths
observed), the formula would give $E^c_{47}$ or $\theta_{47}$ as the case may be.

Hardy’s formula cannot be applied to $E_x$, which is discontinuous.
In applying it to $E^c_x$ we must regard $\sum_{45}^{49} E^c_x$ not as
$$E^c_{45} + E^c_{46} + E^c_{47} + E^c_{48} + E^c_{49}$$
but as an integral,
$$\int_0^5 P_{45+t}\,dt,$$
where $P_{x+t}$ denotes the number exposed to risk at exact age x + t.
Hence Hardy’s formula gives $P_{47\frac12}$ (not $P_{47}$).

If we have the corresponding deaths ($\theta_x$) similarly grouped it is
difficult at first to see what function is given by Hardy’s formula.
Actually the function is $P_x\mu_x$ for the central point, because
$$\int P_{x+t}\,\mu_{x+t}\,dt = \text{deaths occurring between ages A and B},$$
the integral being taken over the interval from age A to age B.

Hence Hardy’s formula applied to $\theta_x$ gives $P_{47\frac12}\,\mu_{47\frac12}$ and since
$P_{47\frac12}$ has been found from $\sum E^c_x$ we can arrive at $\mu_{47\frac12}$ by division.
This will be dealt with later in Chapter X.

Example 3.

Given $\sum_{1}^{7} u_x = 3865$, $\sum_{8}^{14} u_x = 2618$ and $\sum_{15}^{21} u_x = 1885$, find $u_{11}$, assuming
that fourth and higher differences are negligible.

Denoting the groups by $w_{-1}$, $w_0$ and $w_1$, respectively,
$$\Delta^2 w_{-1} = 514$$
and
$$u_{11} = \frac{1}{7}\left[2618 - \frac{7^2 - 1}{24 \times 7^2} \times 514\right] = 371 \text{ almost exactly}.$$

Actually the data are the values of $100e_x$ from ages 75 to 95 inclusive
in E.L. No. 8, where $100e_{85}$ (corresponding to $u_{11}$) is 372.

Example 4.

If we were given $\sum_{1}^{6} u_x = 3401$, $\sum_{7}^{12} u_x = 2434$, $\sum_{13}^{18} u_x = 1778$, i.e. an even
number of terms in each group, the application of King’s formula
would give
$$u_{9\frac12} = \frac{1}{6}\left[2434 - \frac{6^2 - 1}{24 \times 6^2} \times 311\right] = 404 \text{ (to nearest integer)}.$$

It will be noted that $u_{9\frac12}$ is centrally situated in the group 7-12.

By third difference interpolation, using the tabulated values of $u_8$, $u_9$,
$u_{10}$ and $u_{11}$ (i.e. $100e_{82}$, etc., in E.L. No. 8), the value of $u_{9\frac12}$ is found to
be 404, so that on this occasion the group formula gives a good result.
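
The arithmetic of Examples 3 and 4 can be reproduced with a few lines. The function is a direct transcription of formula (18); its name is our own.

```python
from fractions import Fraction

def kings_formula(w_prev, w_mid, w_next, n):
    """Formula (18) applied to three consecutive group totals of n terms each."""
    second_difference = w_prev - 2 * w_mid + w_next
    return (w_mid - Fraction(n * n - 1, 24 * n * n) * second_difference) / n

example_3 = kings_formula(3865, 2618, 1885, 7)   # central term u_11
example_4 = kings_formula(3401, 2434, 1778, 6)   # value half-way between u_9 and u_10

print(float(example_3), float(example_4))
```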


13. Weighted mean. 

This term sometimes leads the student to think that he is dealing 
with a further measure of statistical average, whereas all means are, 
in a sense, weighted means. 


Thus, in the standard expression for the mean, $\sum fx / \sum f$, the observed
values of x are weighted with the frequencies with which they occur;
e.g. a batting average is obtained by weighting the scores with the 
frequencies of their occurrence and dividing by the total frequency 
(number of innings completed). The phrase weighted mean is 
usually used in statistical books when the actual frequencies are not 
available and have to be estimated. Provided that the values of the 
variable are not greatly unequal and that the weights used are not 
wide of the mark, the value thus obtained will usually be very close 
to the true mean which would have been obtained by using the 
actual frequencies. 

Occasionally weighted means arise in another sense. Sometimes 
weights are applied to individual observations to allow for some 
element of relative importance other than numerical frequency. 
Weighted means of this kind are to be regarded as indicators or 
indices of some condition rather than as averages. 


14. Index numbers. 

Economists often wish to have a single measure of the combined
results of many factors operating together. Index numbers are
commonly used for this purpose. They are a special type of weighted
mean of which perhaps the best known is the “Cost of Living
Index”, which reflects the effect of changes in price from year to
year on a fixed “basket” of goods bought by working-class
housewives.

A convenient year was taken as a base and the figure for that
year taken arbitrarily as 100. For many years the “Cost of Living
Index” was based on prices in 1914, but a more up-to-date base
year is now desirable and the necessary data have been collected.
The present (1949) index is only an interim arrangement.

Another example with which the student will meet in investment 
work is the “Actuaries Investment Index”, which gives a measure 
of how shares in certain broad groups are changing in value from 
time to time. 

One of the best examples in actuarial work, however, is provided 
by the Comparative Mortality Figure (C.M.F.) used after the 1921 
Census to compare the mortality in different occupations with that
in the country as a whole and for comparing different occupations 
among themselves. 

The C.M.F. may be represented by the formula

$$1000 \times \frac{\sum P^s_x\, m_x}{\sum P^s_x\, m^s_x},$$

where $P^s_x$ = the number in the specified age-group x of the standard
population,

$m_x$ = central death-rate for the age-group x of the occupation
specified,

$m^s_x$ = central death-rate for the age-group x of the standard
population,

and the summation extends over ages 20-65.

The standard population was based on the number of occupied
and retired civilian males between the ages of 20 and 65 enumerated
at the 1921 Census. The census numbers were scaled down so
that the number of deaths expected between the limiting ages
according to the rates of mortality $m^s_x$ was 1000.

By the use of these reduced populations the expression for the
C.M.F. reduces to $\sum P^s_x\, m_x$. If the result is less than 1000 the
mortality according to this index is lighter than the average; if it
is more than 1000 the mortality is relatively heavy.
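
The working of the formula may be sketched as follows. All the populations and death-rates below are hypothetical figures invented for illustration, not census data, and the function name is our own.

```python
def comparative_mortality_figure(standard_population, occupation_rates, standard_rates):
    """1000 x (deaths expected in the standard population at the occupation's
    rates) / (deaths expected in the standard population at standard rates)."""
    at_occupation_rates = sum(p * m for p, m in zip(standard_population, occupation_rates))
    at_standard_rates = sum(p * m for p, m in zip(standard_population, standard_rates))
    return 1000 * at_occupation_rates / at_standard_rates

# Hypothetical age-groups: standard population and central death-rates.
standard_population = [30000, 25000, 20000, 15000]
standard_rates = [0.004, 0.006, 0.012, 0.025]
occupation_rates = [0.005, 0.006, 0.015, 0.030]

cmf = comparative_mortality_figure(standard_population, occupation_rates, standard_rates)
print(round(cmf))
```

An occupation whose rates equal the standard rates at every age gives exactly 1000, which is the purpose of the scaling described above.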

Any index number is open to the objection that it may convey 
misleading impressions in exceptional circumstances, and against 
the C.M.F. it may be urged that “normal” weight is thereby
attached to values of $m_x$ which may be based on very scanty data
and may be quite unreliable in consequence. 

Index numbers cannot be expected to convey all the information 
given by the data they are intended to summarize, but they are 
nevertheless useful in reducing a mass of classified data to manageable
proportions for purposes of comparison.

A change of base year affects the relative size of all the indices
already calculated and for this reason a mean akin to the geometric
mean has often been recommended.

This would involve terms such as

$$y_1^{x_1}\, y_2^{x_2}\, y_3^{x_3} \ldots \quad\text{instead of}\quad x_1y_1,\ x_2y_2,\ \ldots.$$

An objection is that if one of the y’s should happen to vanish in 
any given year (a by no means impossible occurrence) the whole 
index also vanishes. One advantage of this form of average is, 
however, that a change of base year does not affect the relative 
values of previously calculated indices. (The Actuaries Investment 
Index involves this type of geometric mean.) 
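
The effect of a change of base year on the two kinds of index can be seen in a small sketch. The prices and weights are hypothetical, and the particular form of each index (a weighted mean of price relatives) is an assumption made for illustration; it is not taken from any published index.

```python
import math

# Hypothetical prices of two commodities in three years, with equal weights.
prices = {0: [10.0, 20.0], 1: [20.0, 20.0], 2: [20.0, 40.0]}
weights = [0.5, 0.5]

def arithmetic_index(year, base):
    """Weighted arithmetic mean of price relatives, base year = 100."""
    return 100 * sum(w * p / b for w, p, b in zip(weights, prices[year], prices[base]))

def geometric_index(year, base):
    """Weighted geometric mean of price relatives, base year = 100."""
    return 100 * math.prod((p / b) ** w for w, p, b in zip(weights, prices[year], prices[base]))

def relative_change(index, base):
    """Ratio of the index for year 2 to the index for year 1 on a given base."""
    return index(2, base) / index(1, base)

print(relative_change(arithmetic_index, 0), relative_change(arithmetic_index, 1))
print(relative_change(geometric_index, 0), relative_change(geometric_index, 1))
```

With these figures the arithmetic form gives a different relative change between years 1 and 2 according to the base chosen, while the geometric form gives the same ratio on either base.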
EXAMPLES I

1. An office has analysed its new business figures over a number of
years for endowment assurances with profits. The following table shows
the distribution according to age next birthday at entry:

Age next birthday   No. of policies    Age next birthday   No. of policies
at entry                               at entry
15-19                     30           45-49                    270
20-24                    200           50-54                    180
25-29                    450           55-59                     80
30-34                    420           60-64                     20
35-39                    400           65-69                      5
40-44                    350           70 and over              Nil

Assuming that the exact age is on the average .35 yr. less than the age
next birthday, calculate the mode, standard deviation, mean deviation
and quartile deviation. Apply any tests you know to check approximately
the last two of these values and comment on the results.

2. The following table gives a frequency distribution of ages of 
bridegrooms: 


Age of         Frequency    Age of         Frequency
bridegroom                  bridegroom
15-                15       54-                83
18-               550       57-                55
21-              3050       60-                40
24-              3653       63-                32
27-              2825       66-                24
30-              1674       69-                16
33-              1028       72-                11
36-               714       75-                 6
39-               466       78-                 4
42-               312       81-                 1
45-               238       84-                 1
48-               181
51-               110

Calculate the mean, median, mode, quartile deviation, mean deviation
and standard deviation. Apply the approximate relationships connecting
the values of these indices to check your results.

3. The frequency distribution of a measurable characteristic x varying
between 0 and 2 may be represented by the following expressions:

The frequencies are proportional to ofi for values of x between o and i 
and to {z-xY for values of x between i and 2. 

Find by separate calculation in each case the values of the mean 
deviation, standard deviation and probable error of the distribution. 

4. A frequency curve fitting an observed distribution is given by the 
two equations 

sin^l 
y-a-acos dj* 

where θ varies between −π/2 and π/2. y is the frequency with which
the value x occurs. 

Calculate the mean, median, mode, standard deviation, mean deviation, 
probable error and a measure of skewness. Draw the frequency curve. 

If the observed distribution relates to the number of hours’ sunshine 
recorded at Greenwich on each 21st of March over a period of 50 years, 
state what units a will represent in relation to x and in relation to y.

5. The following table shows the number of deaths (in thousands) 
among the male population in England and Wales in the years 1930-32: 


Age at    No. of     Age at    No. of     Age at    No. of
death     deaths     death     deaths     death     deaths
0-          97       35-         18       70-         84
5-          12       40-         24       75-         72
10-          7       45-         33       80-         45
15-         13       50-         44       85-         20
20-         17       55-         57       90-          5
25-         16       60-         68       95-          1
30-         16       65-         81       100-         0
                                          Total      730

Calculate the mean age at death, the standard deviation of the age at 
death and the coefficients of skewness by formulae (12) and (14). Can 
you suggest a reason for the difference between the two measures of 
skewness? 

6. The following table shows the distribution in groups, according
to sum assured and rate of premium per cent, of a number of policies
effected with an insurance company:

Central sum           Central rate of premium in group              Total      Average
assured in     10s.  £1.10s.  £2.10s.  £3.10s.  £4.10s.  £5.10s.    no. of     rate of
group, £                                                            policies   premium, £
 50              1      2        6        6        2        1         18        3.00
150              2      1        3        8        3        0         17        3.03
250              0      1        3        4        1        1         10        3.30
350              0      3        6        7        5        1         22        3.27
450              2      2        4        8        4        1         21        3.12
550              1      1        3        7        0        0         12        2.83
Total no. of     6     10       25       40       15        4        100
policies
Average sum    300    300      282      310      290      275
assured, £

Criticize the following observations regarding the figures and calculate 
any statistical measures you consider necessary in dealing with the 
questions mentioned: 

“The average values of the sums assured shown in the last line of the
table vary between £275 and £310, i.e. a range of £35 in relation to an
average amount of about £300, while the average values of the rate of
premium given in the last column vary between 2.83 and 3.30, i.e. a
range of .47 in relation to about 3.00. It is clear from these figures that
the sum assured under the policies is on the whole a much more stable
quantity than the rate of premium per cent under the policies. Further,
the arithmetic mean of the first three values of the average sum assured
in the last line of the table is £294 as compared with a figure of about
£292 for the second three values, while the first three premiums in the
last column have an average value of 3.11 as compared with a corresponding
average of about 3.07 in respect of the second three. Both of
these results show clearly that the larger sums assured are associated
with the smaller rates of premium and vice versa.”

7. Find the mode of the following data derived from Friendly Society 
records (a) by a graphic process, (b) by an analytical process. 


Age at commencement   No. of claims    Age at commencement   No. of claims
of illness                             of illness
16-18                      27          40-50                     172
18-22                      48          50-55
22-25                      40          55-60                      52
25-30                      64          60-65                      40
30-40                     135          Over 65                   Nil

8. From the following table of the exposed to risk in central form and
the corresponding deaths per annum, find the values of and for ages 
32, 37, 42 and 47. How would you find the same functions for ages 27 
and 52? 


Age group    Exposed to risk    No. of deaths
             in central form    per annum
25-30             7,300               44
30-35            10,500               74
35-40            14,700              118
40-45            15,400              140
45-50            14,000              154
50-55            12,300              160


9. The following table shows the frequencies with which values of a 
continuous variable were observed to lie within the ranges shown. Find 
the fourth moment about the mean, using Sheppard’s adjustments. 


Value of    Frequency    Value of    Frequency
variable                 variable
0-              41       .6-            500
.1-            152       .7-            411
.2-            225       .8-            260
.3-            318       .9-             72
.4-            470       Over 1         Nil
.5-            560

IMPORTANT FREQUENCY
DISTRIBUTIONS

1. The binomial frequency distribution. 

For reasons which will be apparent when we come to consider
Mortality Tables we shall denote the probability of a success by q
and the probability of a failure by p.

If we have n independent trials with the probability of success
at each trial q, we know that the probability of n successes is $q^n$.
Similarly, the probability of n − 1 successes is $nq^{n-1}p$ and, generally,
that of r successes is ${}^nC_r\,q^r p^{n-r}$. In fact, the probabilities of 0, 1,
2, ... n successes are the successive terms in the expansion of $(p+q)^n$.

In statistics we are interested in frequencies rather than probabilities
as such and we usually imagine that the n trials are repeated
N times. Then the expected frequency with which

n successes will be obtained is $Nq^n$,
n − 1 successes will be obtained is $Nnq^{n-1}p$,
r successes will be obtained is $N\,{}^nC_r\,q^r p^{n-r}$,
0 successes will be obtained is $Np^n$.


Obviously the actual frequencies in a sequence of N repetitions of 
n trials will all be integers. 

The theoretical frequencies shown above will not, in general, be 
integers, but it is to be expected that the actual frequencies will not 
differ greatly from them. 

Before we proceed to calculate the various statistical constants 
of the theoretical distribution it is as well to examine exactly what 
assumptions have been made: the important practical example of 
mortality data will serve as a test case. 

If we were to consider n lives for each of whom the probability
of dying within one year was q, the probability of r deaths would be
${}^nC_r\,q^r p^{n-r}$. Again, if there were N groups each of n lives, as before,
we should expect to find that r people had died in $N\,{}^nC_r\,q^r p^{n-r}$
groups and the frequency distribution would be as follows:

No. of deaths    No. of groups with above no. of deaths
0                $Np^n$
1                $Nnqp^{n-1}$
2                $N\,{}^nC_2\,q^2 p^{n-2}$
...              ...
r                $N\,{}^nC_r\,q^r p^{n-r}$
...              ...
n                $Nq^n$

Incidentally we sometimes omit the N and refer to the probabilities
as “proportionate frequencies”, i.e. the proportion of
trials in which, say, r successes would be obtained out of n.

We have assumed that all the N groups are exactly alike and that 
each of the n lives in each group has the same chance of dying within 
a year. In practice data rarely relate to the number of deaths because 
of the difficulties of ensuring that each life is counted once only in 
the exposed to risk and the deaths; it is usual to base an investigation 
on the number of policies, with certain adjustments with which we 
are not at present concerned. Thus a man who dies having four 
policies in force will probably give rise to four claims, and these 
will be reckoned as four separate deaths. In this way the inclusion 
of duplicates upsets the assumptions made above, and such factors 
as epidemics, wars, local environment, etc. may mean that the 
chance of one man dying is not independent of the chance of another 
man dying. 

Statistical methods should therefore be used with discretion in 
dealing with such data and results obtained by them should not be 
interpreted too dogmatically. 

2. Mean and standard deviation of the binomial distribution.

In the general case of class-interval h we shall have the values
0, h, 2h, 3h, ... nh occurring with frequencies $Np^n$, $Nnp^{n-1}q$, ... $Nq^n$.

The total frequency $= N(p+q)^n = N$.

The mean is therefore

$$\frac{1}{N}\left[0 \cdot Np^n + hNnp^{n-1}q + 2hN\,{}^nC_2\,p^{n-2}q^2 + \ldots + nhNq^n\right]$$
$$= nqh\left[p^{n-1} + (n-1)p^{n-2}q + \ldots + q^{n-1}\right]$$
$$= nqh. \qquad (1)$$

The second moment about the origin ($m_2$) is

$$\frac{1}{N}\left[0^2 \cdot Np^n + h^2Nnp^{n-1}q + (2h)^2N\,{}^nC_2\,p^{n-2}q^2 + \ldots + (nh)^2Nq^n\right]$$
$$= nqh^2\left[p^{n-1} + 2(n-1)p^{n-2}q + 3\,\frac{(n-1)(n-2)}{2!}\,p^{n-3}q^2 + \ldots + nq^{n-1}\right]. \qquad (2)$$

The expression in brackets is the first moment about −1 of the
distribution $(p+q)^{n-1}$. From (1) we know that the first moment
about −1 is (n − 1)q + 1, and the expression (2) therefore reduces
to
$$nqh^2[(n-1)q+1].$$

The second moment about the mean ($\mu_2$) is derived by subtracting
the square of the mean from $m_2$. It is therefore

$$nqh^2[(n-1)q+1] - n^2q^2h^2 = nq(1-q)h^2 = npqh^2. \qquad (3)$$

Hence $$\sigma = h\sqrt{npq}. \qquad (4)$$

The results (1) and (4) are very important.
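
Results (1) and (3) can be confirmed by direct summation for any particular case. The values of n, q and h below are hypothetical and the function name is our own; exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, q, h=1):
    """Mean and second moment about the mean of the values 0, h, 2h, ..., nh
    taken with the binomial probabilities C(n, r) q^r p^(n-r)."""
    p = 1 - q
    probabilities = [comb(n, r) * q**r * p ** (n - r) for r in range(n + 1)]
    mean = sum(r * h * pr for r, pr in enumerate(probabilities))
    second_moment_about_origin = sum((r * h) ** 2 * pr for r, pr in enumerate(probabilities))
    return mean, second_moment_about_origin - mean**2

n, q, h = 10, Fraction(1, 20), 3      # hypothetical values
mean, variance = binomial_moments(n, q, h)
print(mean, variance)
```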

Example 1.

A throw of 5 with two dice being counted as a success, we have $q = \tfrac19$,
where q is the probability of success.

Hence, if four pairs of dice are thrown, the chances of four, three, two,
one and no successes are the successive terms in the expansion of

$$\left(\tfrac19 + \tfrac89\right)^4.$$

If the four pairs of dice are thrown $9^4$ times the frequencies of 0-4
successes would be as follows, if the theoretical probabilities were realized:

No. of successes     4      3      2       1       0
Frequency            1     32    384    2048    4096

The mean of these is
$$\frac{1}{9^4}\left[(4 \times 1) + (3 \times 32) + (2 \times 384) + (1 \times 2048)\right] = \tfrac49,$$
as it should be according to formula (1).

Similarly, the standard deviation will be found to be $\sqrt{\tfrac{32}{81}}$, which is
$\sqrt{npq}$ where n = 4, q = 1/9, p = 8/9.
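
The table of frequencies and the two results can be generated directly from the binomial terms, as in the sketch below (variable names are our own).

```python
from fractions import Fraction
from math import comb

n = 4                       # four pairs of dice
q = Fraction(4, 36)         # a total of 5 arises in 4 of the 36 equally likely throws
p = 1 - q
N = 9**4                    # number of repetitions of the four throws

# Expected frequency of r successes, listed from 4 successes down to 0.
frequencies = {r: N * comb(n, r) * q**r * p ** (n - r) for r in range(n, -1, -1)}

mean = sum(r * f for r, f in frequencies.items()) / N
variance = sum(r * r * f for r, f in frequencies.items()) / N - mean**2
print({r: int(f) for r, f in frequencies.items()}, mean, variance)
```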


3. The normal curve of error.

In its simplest form the equation of this curve is

$$Y = e^{-X^2}.$$

Any other form can be reduced to this merely by a change of
scale and a change of origin.

The most common form of the equation is
$$y = y_0\,e^{-x^2/2\sigma^2}.$$

This is derived from the simple equation above by putting

$$Y = \frac{y}{y_0} \quad\text{and}\quad X = \frac{x}{\sigma\sqrt2},$$

where $y_0$ and σ are constants, the significance of which will be
considered later.

There is in fact only one normal curve, a fact which makes it of
the first importance in statistical theory.

The curve $y = y_0\,e^{-x^2/2\sigma^2}$ is clearly symmetrical about the y-axis
and approaches the x-axis asymptotically as $x \to \pm\infty$; the total area
corresponding to the total frequency of the distribution represented
by the curve

$$= y_0\int_{-\infty}^{\infty} e^{-x^2/2\sigma^2}\,dx = 2y_0\int_0^{\infty} e^{-x^2/2\sigma^2}\,dx.$$

Putting $\dfrac{x^2}{2\sigma^2} = t$ the integral becomes

$$\sqrt2\,y_0\sigma\int_0^\infty e^{-t}\,t^{-1/2}\,dt = \sqrt2\,y_0\sigma\,\Gamma(\tfrac12) = \sqrt{2\pi}\,y_0\sigma.$$

[The proof that $\Gamma(\tfrac12) = \sqrt\pi$ is outside the scope of this book.]

Hence, if we denote as usual the total frequency by N, the
equation of the curve is

$$y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}. \qquad (5)$$

So far we have regarded σ merely as a constant in the equation.
We shall now show that it is in fact the standard deviation of the
distribution. As the curve is symmetrical the mean is clearly zero,
as are the mode and median.

The square of the standard deviation (s.d.) is

$$\frac{1}{N}\int_{-\infty}^{\infty} x^2 y\,dx = \frac{2}{\sigma\sqrt{2\pi}}\int_0^\infty x^2 e^{-x^2/2\sigma^2}\,dx.$$

Integrate by parts, taking x as the first part and $xe^{-x^2/2\sigma^2}$ as the
second.

$$\text{Square of s.d.} = \frac{2}{\sigma\sqrt{2\pi}}\left[x\left(-\sigma^2 e^{-x^2/2\sigma^2}\right)\right]_0^\infty + \frac{2\sigma}{\sqrt{2\pi}}\int_0^\infty e^{-x^2/2\sigma^2}\,dx = \sigma^2,$$

since the first bracket vanishes at both limits and

$$\int_0^\infty e^{-x^2/2\sigma^2}\,dx = \frac{\sigma\sqrt\pi}{\sqrt2}, \text{ as shown above.}$$
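
Both results, that the total area under curve (5) is N and that its second moment about the mean is σ², can be checked by numerical integration. The sketch below uses a simple mid-point rule over a range of twelve standard deviations either side of the mean; the values of N and σ are illustrative, and the helper function is our own.

```python
import math

def midpoint_integral(f, lo, hi, steps=20_000):
    """Mid-point rule approximation to the integral of f from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

N, sigma = 1000.0, 2.5      # illustrative total frequency and constant sigma

def y(x):
    """Curve (5): normal curve with total frequency N and mean zero."""
    return N / (sigma * math.sqrt(2 * math.pi)) * math.exp(-x * x / (2 * sigma * sigma))

lo, hi = -12 * sigma, 12 * sigma
area = midpoint_integral(y, lo, hi)
variance = midpoint_integral(lambda x: x * x * y(x), lo, hi) / area
print(area, variance)
```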


4. Standard tables.

When the total frequency N and σ are both unity the equation
of the curve is

$$y = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$$

Extensive sets of tables have been published for the curve in this
form. The most important are those which give:

(i) the ordinate y for values of x which are at very close intervals
when x is small and y changing fairly rapidly, and for values
of x at less frequent intervals, when x is large and y is changing
only slowly;

(ii) values of $\displaystyle\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx$ for different positive values of z.

This function is usually denoted by $\tfrac12(1+\alpha_z)$ and represents the
area of the normal curve lying to the left of the ordinate x = z.

Since the area to the left of the origin is ½ (the total area being
unity) it follows that the area bounded by the curve, the axis of x
and the ordinates x = 0, x = z is $\tfrac12\alpha_z$.

Hence $\alpha_z$ represents the area of the curve lying between the
ordinates x = ±z and can readily be obtained from the tabulated
values of $\displaystyle\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx$ by doubling and subtracting unity from
the result.

Full tables of the ordinate $\dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ and of the area $\tfrac12(1+\alpha_x)$ are
given in Sheppard’s Tables of Area and Ordinate in terms of Abscissa.
In using them for the general Normal Distribution, x/σ must be
taken as a new variable x′ (say) for entering the table and the ordinate
so obtained must be multiplied by N/σ, while the area must be
multiplied by N.

Table I in the Appendix can also be used as shown in Example 2. 
The student will find this table reproduced in A Short Collection of 
Actuarial Tables for use in the examinations. 

The importance of this table will be appreciated when sampling 
is discussed (Chapter IV), but the following will at once be obvious. 

Taking the general equation $y = \dfrac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$, the probability that
an observation taken at random lies within k, say, of the mean is
clearly the area of the curve lying between the ordinates x = −k and
x = k divided by the total frequency: this probability is therefore

$$\frac{2}{\sigma\sqrt{2\pi}}\int_0^k e^{-x^2/2\sigma^2}\,dx. \qquad (6)$$

By taking x/σ as a new variable the values of these probabilities
can be read off (or interpolated where necessary) from the prepared
tables. The following values are important:

k           Probability
.6745σ      .5000
σ           .6827
2σ          .9545
3σ          .9973

Thus we see that about 95½ per cent of the total area lies between
x = −2σ and x = +2σ, while no less than 99.73 per cent lies between
−3σ and +3σ.

Expressed differently, this means that in a normal frequency
distribution about 95½ per cent of the observations lie within a
distance 2σ of the mean while about 99.7 per cent lie within a
distance 3σ of the mean.
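
Where printed tables are not to hand, the same probabilities can be obtained from the error function, since the area between x = −k and x = +k for the unit normal curve equals erf(k/√2). The sketch below is an alternative to the tabular method described above, not a reproduction of it; the function name is our own.

```python
import math

def probability_within(k):
    """Probability that a normal observation lies within k standard
    deviations of the mean (k is measured in units of sigma)."""
    return math.erf(k / math.sqrt(2))

table = {k: round(probability_within(k), 4) for k in (0.6745, 1, 2, 3)}
print(table)
```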
5. Probable error.

The first entry in the table relates to the probable error (Mathematics
for Actuarial Students, Part II, Chap. XII, para. ii).

If we consider a general frequency curve y = f(x), the probable
error (P) is given by the equation

$$\int_{M-P}^{M+P} f(x)\,dx = \tfrac12\int_a^b f(x)\,dx, \qquad (7)$$

where M is the mean and the frequency curve stretches from x = a
to x = b. Expressed in words this means that half the total frequency
occurs for values of x between (mean − P) and (mean + P). An
observed value of x selected at random is as likely to fall within the
range M − P to M + P as it is to lie outside that range.

The probable error is rarely used nowadays; for the normal
curve it is approximately equal to .6745σ.

6. Mean deviation of the normal distribution. 

Since the mean and median are zero, the mean deviation

$$= \frac{2}{\sigma\sqrt{2\pi}}\int_0^{\infty} x\,e^{-x^2/2\sigma^2}\,dx = \sigma\sqrt{\frac{2}{\pi}}$$

= .7979σ approx.

This value for the mean deviation is also widely used in the form

mean deviation = 4/5 standard deviation, (8)

but may be wide of the mark if the distribution considered is not
approximately normal.
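
As a rough numerical check of the factor .7979, the sketch below integrates the mean deviation of a normal curve by the mid-point rule and compares it with the closed form σ√(2/π). The function name, the number of steps and the upper limit of integration are assumptions made for illustration only.

```python
import math

def mean_deviation_normal(sigma, steps=20_000, upper=12.0):
    """Mid-point-rule estimate of the mean deviation of a normal curve
    with mean zero; the integral is cut off at upper * sigma."""
    h = upper * sigma / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * math.exp(-x * x / (2.0 * sigma ** 2)) * h
    # Factor 2 for the two symmetrical halves of the curve.
    return 2.0 * total / (sigma * math.sqrt(2.0 * math.pi))

sigma = 2.5
print(mean_deviation_normal(sigma) / sigma)   # close to .7979
print(math.sqrt(2.0 / math.pi))               # closed form
```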

Example 2. 

A normal distribution of a continuous variable has a mean of 13.4 and
a standard deviation of 2.5. Find the probability that a value selected at
random lies between the values 11.8 and 15.0.

Taking the mean as origin the limits become −1.6 and 1.6. Hence the
probability required is, by (6) above,

$$\frac{2}{\sqrt{2\pi}\times 2.5}\int_0^{1.6} e^{-x^2/(2\times 2.5^2)}\,dx.$$

To make use of the tables in the Appendix we proceed as follows:
Let x = 2.5x', so that dx = 2.5 dx'.

When x = 1.6, x' = 1.6/2.5 = .64.

The required probability is therefore $\dfrac{2}{\sqrt{2\pi}}\displaystyle\int_0^{.64} e^{-x'^2/2}\,dx'$, which by means
of the tables is found to be .478 approx.

More generally, the substitution x = σx' reduces the integral

$$\frac{2}{\sigma\sqrt{2\pi}}\int_0^{k} e^{-x^2/2\sigma^2}\,dx$$

to

$$\frac{2}{\sqrt{2\pi}}\int_0^{k/\sigma} e^{-x'^2/2}\,dx' \qquad (9)$$

and the tables can be used directly.
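
The result of Example 2 can be reproduced without tables. This is a sketch only, assuming the standard `math.erf`; the helper `normal_cdf` is a name chosen here and does not appear in the original.

```python
import math

def normal_cdf(x, mean, sd):
    """Proportion of a normal distribution lying below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mean, sd = 13.4, 2.5
probability = normal_cdf(15.0, mean, sd) - normal_cdf(11.8, mean, sd)
print(round(probability, 3))
```

Because the limits are equidistant from the mean, the same figure follows from formula (9) with k/σ = .64.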


Example 3. 

The following table (A) gives the distribution of heights of 1000 men. 
Find the normal curve representing the same total frequency and with 
the same mean and the same standard deviation. Draw the curve and the 
histogram. 

Compare the value of the interquartile range obtained from the 
statistics with that of the corresponding normal distribution. 


Table A

    Stature in     No. of men within      Stature in     No. of men within
    inches         these limits           inches         these limits
                   of stature                            of stature

    61.5-62.5          4.0                69.5-70.5         138.5
    62.5-63.5         19.0                70.5-71.5         108.0
    63.5-64.5         24.5                71.5-72.5          53.5
    64.5-65.5         40.5                72.5-73.5          47.5
    65.5-66.5         84.5                73.5-74.5          21.0
    66.5-67.5        123.5                74.5-75.5          12.0
    67.5-68.5        139.0                75.5-76.5           5.0
    68.5-69.5        179.0                76.5-77.5            .5
Table B gives values of the function $y' = e^{-x'^2/2}$.


Table B

    x'      y' = e^(−x'²/2)    log y'         x'      y' = e^(−x'²/2)    log y'

    0         1.00000          0              2.6       .03405          2̄.53209
     .2        .98020          1̄.99131        2.8       .01984          2̄.29757
     .4        .92312          1̄.96526        3.0       .01111          2̄.04567
     .6        .83527          1̄.92183        3.2       .00598          3̄.77641
     .8        .72615          1̄.86103        3.4       .00309          3̄.48978
    1.0        .60653          1̄.78285        3.6       .00153          3̄.18577
    1.2        .48675          1̄.68731        3.8       .00073          4̄.86439
    1.4        .37531          1̄.57439        4.0       .00034          4̄.52564
    1.6        .27804          1̄.44410        4.2       .00015          4̄.16952
    1.8        .19790          1̄.29644        4.4       .00006          5̄.79603
    2.0        .13534          1̄.13141        4.6       .00003          5̄.40516
    2.2        .08892          2̄.94901        4.8       .00001          6̄.99693
    2.4        .05614          2̄.74923        5.0         -             6̄.57132


There are two points to note about Table A. First, the frequencies
ending in .5 are probably caused by men whose height (to the degree of
accuracy adopted) coincided with the limit of a group, e.g. 65.5 inches.
It is customary in such circumstances to allot .5 to each of the two
groups adjoining. Thus three men of height 65.5 inches would be
counted as 1.5 in the group 64.5-65.5 and 1.5 in the group 65.5-66.5.

The second point to notice is that the distribution is roughly
symmetrical and an attempt to fit a normal curve seems likely to be fairly
successful in view of the run of the data.

The general equation of the normal curve is

$$y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}$$

referred to the ordinate through the mean as axis of y.

Hence we need to find the mean (so as to fix the axes), the total frequency
N and the standard deviation σ.

Assuming that the total frequency of each group is concentrated at
the mid-point, we proceed thus (see table on p. 40):

69 is taken as an arbitrary origin. Referred to this origin the mean is
−145/1000 inches.

∴ mean = 69 − .145 = 68.855 inches.
$\mu'_2$ (second moment about the origin) = 6580/1000, ignoring Sheppard's
adjustment.

∴ $\mu_2$ (second moment about the mean) = 6.580 − (−.145)², ignoring
Sheppard's adjustment,

= 6.559.

Subtracting 1/12 of the square of the class-interval (Sheppard's
adjustment) from this value for $\mu_2$ we obtain 6.476 and

σ = √6.476 = 2.54 inches.

The lower quartile separates the 250th and the 251st observations and
from the last column we see that it lies in the range 66.5-67.5, which
includes 123.5 observations. Hence the lower quartile

= 66.5 + 77.5/123.5 = 67.13 inches.

Similarly, the upper quartile, which separates the 750th and the 751st
observations,

= 69.5 + 136/138.5 = 70.48 inches,

or alternatively = 70.5 − 2.5/138.5 = 70.48 inches as before.

∴ interquartile range = 3.35 inches.
It will be remembered that by definition one-quarter of the total
frequency occurs for values of the variable between the lower quartile
and the median and also between the median and the upper quartile.

The difference between the lower quartile and the median is equal to
the difference between the median and the upper quartile only in a
symmetrical curve.


Fig. 1


If we measure from the mean a distance equal to P, the probable
error, in both directions we enclose half the total frequency.

In a symmetrical curve such as the one considered, the median and
mean coincide and it will be seen that the interquartile range is equal to
2P = 2 × .6745σ in a normal curve.

Hence the interquartile range = 2 × .6745 × 2.54

= 3.43,

as compared with 3.35 obtained above from the data.

This is one indication that the normal curve is not a perfect fit. 
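
The arithmetic of Example 3 up to this point can be retraced as follows. This is a sketch under stated assumptions: the frequencies are those of Table A as read from the text, each group is treated as concentrated at its mid-point, and the function `quantile` is an illustrative name for the linear interpolation used above.

```python
import math

# Lower limits of the sixteen groups of Table A and their frequencies.
lower_limits = [61.5 + i for i in range(16)]
freqs = [4.0, 19.0, 24.5, 40.5, 84.5, 123.5, 139.0, 179.0,
         138.5, 108.0, 53.5, 47.5, 21.0, 12.0, 5.0, 0.5]

N = sum(freqs)
mids = [lo + 0.5 for lo in lower_limits]
mean = sum(f * m for f, m in zip(freqs, mids)) / N
mu2_raw = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / N
mu2 = mu2_raw - 1.0 / 12.0          # Sheppard's adjustment, class-interval 1
sd = math.sqrt(mu2)

def quantile(position):
    """Value below which `position` observations lie, interpolating
    linearly within the group that contains it (class-interval 1)."""
    cumulative = 0.0
    for lo, f in zip(lower_limits, freqs):
        if cumulative + f >= position:
            return lo + (position - cumulative) / f
        cumulative += f
    raise ValueError("position exceeds the total frequency")

q1, q3 = quantile(N / 4), quantile(3 * N / 4)
normal_iqr = 2 * 0.6745 * sd        # interquartile range of the fitted normal curve

print(round(mean, 3), round(mu2, 3), round(sd, 2))
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2), round(normal_iqr, 2))
```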
We proceed to plot the curve and the histogram representing the
given data.

The equation of the curve is

$$y = \frac{1000}{\sqrt{2\pi}\times 2.54}\,e^{-x^2/(2\times 2.54^2)}.$$

To put this in the form $y' = e^{-x'^2/2}$, so that the given table of values
can be used, we let

$$y = \frac{1000}{\sqrt{2\pi}\times 2.54}\,y',$$

i.e. log y = 3 − ½ log 2π − log 2.54 + log y'

and x = 2.54x'.

log π = .4971499, log 2 = .3010300 and log 2.54 = .4048337.

∴ log y = log y' + 2.19608.

The following table is constructed from Table B by adding this
constant to the given values:

      x       log y        y            x        log y        y

     0       2.19608     157.06        6.604      .72817     5.35
      .508   2.18739     153.95        7.112      .49365     3.12
     1.016   2.16134     144.99        7.620      .24175     1.74
     1.524   2.11791     131.19        8.128     1̄.97249      .94
     2.032   2.05711     114.05        8.636     1̄.68586      .49
     2.540   1.97893      95.26        9.144     1̄.38185      .24
     3.048   1.88339      76.45        9.652     1̄.06047      .11
     3.556   1.77047      58.95       10.160     2̄.72172      .05
     4.064   1.64018      43.67       10.668     2̄.36560      .02
     4.572   1.49252      31.08       11.176     3̄.99211      .01
     5.080   1.32749      21.26       11.684     3̄.60124      .00
     5.588   1.14509      13.97       12.192     3̄.19301      .00
     6.096    .94531       8.82       12.700     4̄.76740      .00


In order to illustrate the rapid tailing off of the values of y more
significant figures have been kept than can be used in drawing the graph.
The curve and the histogram are shown in Fig. 1.
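
The ordinates of the fitted curve can also be computed directly rather than through logarithms. The sketch below assumes N = 1000 and σ = 2.54 as found above; the function name `ordinate` is illustrative.

```python
import math

N, sd = 1000, 2.54

def ordinate(x):
    """Ordinate of the fitted normal curve at a distance x from the mean."""
    return N / (sd * math.sqrt(2.0 * math.pi)) * math.exp(-x * x / (2.0 * sd * sd))

# Abscissae x = 2.54 x' for x' = 0, .2, .4, ... 5.0, as in the table above.
for i in range(26):
    x = 0.2 * i * sd
    print(f"{x:7.3f}  {ordinate(x):8.2f}")
```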


7. Approximation of the binomial distribution to the normal
curve.

The binomial distribution may be represented by a curve drawn
through n + 1 points corresponding to the n + 1 terms of the
expansion. This diagram will be symmetrical if p = q, and if n is large will
not be very different from the normal curve.
We shall now show that as n → ∞ the distribution does in fact
approximate to the normal distribution. Incidentally, the series of
points will at the same time approximate to a continuous curve.

As n is to tend to infinity we can, without loss of generality,
assume that n = 2k. As p = q = ½, the binomial distribution can be
written $N(\tfrac12 + \tfrac12)^{2k}$.

The central term $= N\,\dfrac{(2k)!}{k!\,k!}\left(\tfrac12\right)^{2k} = y_0$.

To arrive at the equation of the normal curve this term must
correspond to x = 0, and if we imagine the points representing
successive terms of the expansion to be at intervals of h, the term
corresponding to x = ±rh will be

$$y_{rh} = N\,\frac{(2k)!}{(k-r)!\,(k+r)!}\left(\tfrac12\right)^{2k}.$$

Hence

$$\frac{y_{rh}}{y_0} = \frac{k(k-1)(k-2)\cdots(k-r+1)}{(k+r)(k+r-1)\cdots(k+1)}
= \frac{\left(1-\frac1k\right)\left(1-\frac2k\right)\cdots\left(1-\frac{r-1}{k}\right)}{\left(1+\frac1k\right)\left(1+\frac2k\right)\cdots\left(1+\frac rk\right)}.$$

∴ provided r < k

$$\log y_{rh} - \log y_0 = \log\left(1-\frac1k\right) + \log\left(1-\frac2k\right) + \dots + \log\left(1-\frac{r-1}{k}\right)
- \log\left(1+\frac1k\right) - \log\left(1+\frac2k\right) - \dots - \log\left(1+\frac rk\right)$$

$$= -\frac2k\,(1+2+3+\dots+\overline{r-1}) - \frac rk - \text{terms involving second and higher powers of } 1/k$$

$$= -\frac{r^2}{k} \text{ approximately.}$$

$$\therefore\ y_{rh} = y_0\,e^{-r^2/k}. \qquad (10)$$

To derive a continuous curve the interval h between the points
must tend to zero. In that event rh becomes the abscissa x.

Thus

$$y_x = y_0\,e^{-x^2/kh^2}.$$

We know that for the binomial distribution σ² = npqh².

As n = 2k and p = q = ½,

$$\sigma^2 = 2k\cdot\tfrac12\cdot\tfrac12\,h^2 = \frac{kh^2}{2} \quad\text{and}\quad \frac{1}{kh^2} = \frac{1}{2\sigma^2}.$$

Substituting in (10), we obtain

$$y_x = y_0\,e^{-x^2/2\sigma^2}.$$
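
Equation (10) can be tested numerically by comparing the exact ratio of a binomial term to the central term with the exponential approximation. This is a sketch only; the value k = 500 and the function names are choices made here for illustration, and `math.comb` requires Python 3.8 or later.

```python
import math

def exact_ratio(k, r):
    """y_rh / y_0 for the symmetrical binomial (1/2 + 1/2)^(2k)."""
    return math.comb(2 * k, k + r) / math.comb(2 * k, k)

def normal_ratio(k, r):
    """The approximation e^(-r^2/k) of equation (10)."""
    return math.exp(-r * r / k)

k = 500
for r in (0, 5, 10, 20, 30):
    print(r, round(exact_ratio(k, r), 5), round(normal_ratio(k, r), 5))
```

The agreement improves as k is increased with r/k kept small, which is the condition under which the higher powers of 1/k were neglected.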


8. The Poisson Distribution. 

In deriving the normal curve as an approximation to the binomial
it was assumed that p = q and n → ∞. In practical work this means
that p and q should not be very dissimilar and that n should be large.

In some fields of statistics, q is often very small indeed although 
n is so large that the product nq is appreciable. For instance, the 
chance of a given employee being involved in an accident in a factory 
(q) is usually very small but the number employed (n) is so large that 
nq, the chance of an accident occurring to someone is quite an 
important consideration. Similarly, the probability of any given 
person dying within a year is very small indeed except at high ages 
but the number exposed to the risk of death in most investigations 
into mortality is very large. 

In such circumstances the binomial distribution can be repre¬ 
sented approximately as follows: 

Let nq = m and assume that n → ∞ and q → 0, m remaining finite.

The probability of r successes in n trials $= \dfrac{n!}{r!\,(n-r)!}\,q^r p^{n-r}$, which
may be written

$$\frac{n!}{r!\,(n-r)!}\left(\frac mn\right)^r\left(1-\frac mn\right)^{n-r}
= \frac{m^r}{r!}\left(1-\frac mn\right)^n\cdot\frac{n!}{(n-r)!\,n^r\,(1-m/n)^r}. \qquad (12)$$

To simplify this expression we make use of Stirling's approximation:

$$n! \approx \sqrt{2\pi n}\;n^n e^{-n}.$$

Substituting for the terms involving factorials we obtain:

$$\frac{n!}{(n-r)!\,n^r\,(1-m/n)^r} = \frac{e^{-r}}{(1-r/n)^{\,n-r+\frac12}\,(1-m/n)^r}. \qquad (13)$$

As n → ∞, $(1-r/n)^{\,n-r+\frac12} \to e^{-r}$ and $(1-m/n)^r \to 1$.

Hence (13) → 1 and (12) tends to the form:

$$\frac{m^r}{r!}\lim_{n\to\infty}\left(1-\frac mn\right)^n, \quad\text{i.e.}\quad \frac{e^{-m}m^r}{r!}. \qquad (14)$$

Thus the probabilities of 0, 1, 2 ... r successes become in the limit
the successive terms of the series:

$$e^{-m}\left(1 + \frac{m}{1!} + \frac{m^2}{2!} + \dots + \frac{m^r}{r!} + \dots\right). \qquad (15)$$

The distribution represented by the successive terms of (15) is
called the Poisson distribution.

It will be seen that the total probability for all values of r is unity.
It is left to the student to prove that the mean is m and the standard
deviation √m.
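
The limiting process can be illustrated numerically. The sketch below compares binomial terms with the corresponding Poisson terms for m = 5 and n = 10,000 (values chosen here purely for illustration) and checks the total, mean and variance of the Poisson series by direct summation; the function names are not from the original.

```python
import math

def binomial_term(n, q, r):
    """Probability of r successes in n trials, each with chance q."""
    return math.comb(n, r) * q ** r * (1.0 - q) ** (n - r)

def poisson_term(m, r):
    """The (r + 1)th term of series (15)."""
    return math.exp(-m) * m ** r / math.factorial(r)

m, n = 5.0, 10_000
q = m / n
for r in range(13):
    print(r, round(binomial_term(n, q, r), 5), round(poisson_term(m, r), 5))

# Moments of the Poisson series, summed far enough for the tail to be negligible.
terms = [poisson_term(m, r) for r in range(80)]
total = sum(terms)
mean = sum(r * t for r, t in enumerate(terms))
variance = sum((r - mean) ** 2 * t for r, t in enumerate(terms))
print(total, mean, variance)
```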


9. The use of the normal approximation. 

It may seem at first that the normal curve will not in general 
be a very close approximation to the binomial distribution. In 
this connection it will be realized: 

1. That the binomial distribution is a series of points approxi¬ 
mating to a continuous curve only when the number of terms 
is indefinitely increased. 

2. That the binomial distribution is finite, while the normal 
curve approaches infinity in either direction. 

3. That, unless p = q, the binomial distribution is skew, while
the normal curve is symmetrical.

In actual practice the approximation in the neighbourhood of the
mean is much better than these considerations would suggest.

If we measure a distance kσ from the mean in both directions a
balance of errors is obtained and the frequency enclosed by the
normal curve may be a satisfactory approximation to the frequency
in the binomial distribution within the same limits.

This is very important, because it means that the sets of tables
prepared for the normal curve may sometimes be used for a
binomial distribution. Thus from the table on p. 36 we often assume
that about two-thirds of the total frequency occurs for values of x
lying between M − σ and M + σ, where M is the mean and σ the
standard deviation, while about 95 per cent occurs for values of x
lying between M − 2σ and M + 2σ. The student will, however,
realize that unless n is fairly large or p is nearly equal to q these
assumptions may be far from the truth.

Example 4 . 

The following table gives the first 13 terms in the expansion of

10,000 (.95 + .05)^100,

the figure of 10,000 being introduced to eliminate decimals.

    Value of                  Accumulated      Value of                  Accumulated
    variable x   Frequency    frequency        variable x   Frequency    frequency

        0            59           59               8           649         9,368
        1           311          370               9           349         9,717
        2           812        1,182              10           167         9,884
        3         1,396        2,578              11            72         9,956
        4         1,781        4,359              12            28         9,984
        5         1,800        6,159
        6         1,500        7,659            13-100          16        10,000
        7         1,060        8,719           inclusive

The distribution is clearly very skew.

In the previous notation N = 10,000, n = 100, p = .95, q = .05.

The mean = nq = 5.

The standard deviation = √(npq) = 10√.0475 = 2.18.

The limits M − σ to M + σ become 2.82 to 7.18 and include 75.37 per
cent of the total frequency.

The limits M − 2σ to M + 2σ become .64 to 9.36 and include 96.58
per cent of the total frequency.

If we used a Poisson distribution as an approximation we should take
the Poisson parameter equal to the observed mean, viz. 5, and the
standard deviation would be √5 or 2.24 as compared with the more
accurate value of 2.18.
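
The figures of Example 4 can be recomputed from the binomial terms themselves. This is a sketch; the function name `fraction_within` is introduced here, and small differences from the table in the last digit arise because the tabulated frequencies are rounded to whole numbers.

```python
import math

n, q = 100, 0.05
p = 1.0 - q

# Proportionate frequencies, i.e. the terms of (.95 + .05)^100.
terms = [math.comb(n, x) * q ** x * p ** (n - x) for x in range(n + 1)]

mean = n * q
sd = math.sqrt(n * p * q)

def fraction_within(k):
    """Proportion of the total frequency with mean - k*sd < x < mean + k*sd."""
    return sum(t for x, t in enumerate(terms)
               if mean - k * sd < x < mean + k * sd)

print(round(sd, 2), round(math.sqrt(mean), 2))      # binomial and Poisson s.d.
print(round(100 * fraction_within(1), 2))            # per cent within one s.d.
print(round(100 * fraction_within(2), 2))            # per cent within two s.d.
```

Both percentages exceed the normal-curve values of about 68 and 95½, which reflects the skewness and the small number of distinct values of x enclosed.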

Example 5 . 

An assurance company is about to assure a group of 10,000 lives all
aged x for a sum of £100 payable on each death within one year. The
company wishes to charge the minimum single premium which will
ensure that its probability of suffering a loss on the whole transaction will
not be greater than one-fourth.

Calculate the approximate single premium per cent which should be
charged to each member, ignoring interest and expenses, and assuming
that for a life aged x the probability of death within one year is .01.

Discuss briefly the effect on the problem of each of the following 
variations: 

(i) the sum assured being £200 instead of £100; 

(ii) the number of lives being 100 instead of 10,000; 

(iii) the specified probability of loss being some figure other than 
one-fourth. 

Calculate the approximate single premium per cent in cases (i) and (ii). 

The probabilities of 0, 1, 2, ... deaths occurring are the successive
terms in the expansion of $(.99 + .01)^{10{,}000}$.

The terms are proportionate frequencies and are exactly similar to
the frequencies we have been discussing except that N = 1.

The mean number of deaths is therefore nq or 100 and the standard
deviation √(npq) or 10 very nearly.

Since n is large we can assume that the distribution is nearly normal,
especially as only an approximate result is required. P, the probable error,
is, in that event, .6745σ

= 6.745.

If M is the mean, half the total frequency lies between M − P and
M + P, or in a symmetrical distribution a quarter of the total frequency
occurs for values of x greater than M + P. Applied to our proportionate
frequencies this means that the total of the frequencies for deaths in
excess of 100 + 6.745 is only one-quarter, i.e. if we base our premium on
the assumption that 107 deaths will occur the premium will prove
inadequate in less than one case in four.

The required premium is therefore £10,700 or £1. 1s. 5d. per cent,
say.

The effects of the modifications set out in the question are as follows: 

(i) This has no effect on the premium per cent since the amount of 
the claim is the same for each death. If the sum assured were not 
the same for all the lives the problem would be more complicated. 
This aspect will be discussed later. (Chapter III, Ex. 3.) 

(ii) The expression we now have to consider is (.99 + .01)^100 and
the mean of the distribution formed by the terms of the expansion
is unity.

The standard deviation is √(100 × .99 × .01) = .995.

We can no longer assume that the distribution approximates to the
normal over most of its range, because n is now relatively small. We can
now, however, expand the binomial to obtain:

    Chance of no deaths occurring  = (.99)^100                      = .366 approx.
    Chance of one death occurring  = 100 (.99)^99 (.01)             = .370 approx.
    Chance of two deaths occurring = (100 × 99 / 2) (.99)^98 (.01)² = .185 approx.

Hence the chance of more than one death is

1 − .366 − .370 = .264,

while the chance of more than two deaths is

1 − .366 − .370 − .185 = .079.

A premium based on one death occurring out of a hundred would be
inadequate in more than one case in four and we must allow for two deaths
by charging a premium of £2 per cent.

(iii) The probability of one-fourth enables us to use the probable
error when 10,000 lives are involved. For any other probability,
say one-fifth, we might refer to prepared tables based on the
normal curve. Such a table shows, for instance, that the
probability of an observation lying outside the ordinates x = ±.8σ (the
mean being the origin) is .42371, while the probability that it lies
outside the ordinates ±.9σ is .36812.

Since the curve is symmetrical we deduce that the probability of a
value being greater than .8σ is half of .42371 = .21185, while the
probability that it is greater than .9σ is .18406.

By interpolation the chance that it is greater than .84σ is .2, or one-fifth.
In other words, if we were to charge a premium which could be
expected to prove inadequate on only one occasion in five we should
allow for a number of deaths equal to the mean + .84σ.

For 10,000 assured we therefore allow for 100 + .84 × 10 deaths, or,
say, 109 deaths, by charging a premium of £1.09 or £1. 1s. 10d. per cent.

For 100 lives the normal curve is not a sufficiently good
approximation, and unless the probability of loss were small the labour involved in
evaluating successive terms of the expansion would be heavy.

Actually the premium of £2 per cent calculated in (ii) would cover
two deaths and the probability that it would prove inadequate is only
.079, i.e. less than one-twelfth.
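
The two methods used in Example 5 can be set side by side in a short sketch. The function names and the convention of rounding the number of deaths up to the next whole number are assumptions made here for illustration; the multiples .6745 and .84 are those quoted in the text.

```python
import math

def deaths_to_allow_normal(n, q, multiple):
    """Normal approximation: mean + multiple * s.d., rounded up to a
    whole number of deaths."""
    mean = n * q
    sd = math.sqrt(n * q * (1.0 - q))
    return math.ceil(mean + multiple * sd)

def deaths_to_allow_exact(n, q, max_loss_prob):
    """Smallest d such that the chance of more than d deaths, taken from
    the binomial terms, does not exceed max_loss_prob."""
    cumulative = 0.0
    for d in range(n + 1):
        cumulative += math.comb(n, d) * q ** d * (1.0 - q) ** (n - d)
        if 1.0 - cumulative <= max_loss_prob:
            return d
    return n

def premium_per_cent(deaths, n):
    """Single premium per £100 assured when `deaths` claims are provided for."""
    return 100.0 * deaths / n

print(premium_per_cent(deaths_to_allow_normal(10_000, 0.01, 0.6745), 10_000))  # one-fourth
print(premium_per_cent(deaths_to_allow_normal(10_000, 0.01, 0.84), 10_000))    # one-fifth
print(premium_per_cent(deaths_to_allow_exact(100, 0.01, 0.25), 100))           # case (ii)
```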
EXAMPLES 2

1. In 100 investigations into the mortality of 500 lives all aged 70, the
number of deaths occurring in one year were as follows:

    Investi-   No. of     Investi-   No. of     Investi-   No. of     Investi-   No. of
    gation     deaths     gation     deaths     gation     deaths     gation     deaths
    no.                   no.                   no.                   no.

       1         20          26        10          51        16          76        15
       2         24          27        11          52        19          77        16
       3         19          28        10          53        21          78        13
       4         20          29         9          54        15          79        17
       5          7          30        12          55        16          80        16
       6          8          31         9          56        12          81        15
       7         25          32        13          57        13          82        16
       8         14          33        15          58        17          83        13
       9         10          34        16          59        19          84        15
      10          8          35        14          60        15          85        17
      11          9          36        16          61        16          86        12
      12          7          37        18          62        15          87        17
      13          8          38        13          63        12          88        20
      14         18          39        17          64        14          89        22
      15         16          40        19          65        17          90        15
      16         12          41        10          66        18          91        14
      17         13          42        14          67        11          92        It
      18         11          43        16          68        15          93        16
      19          9          44        19          69        17          94        17
      20         18          45        22          70        21          95        12
      21         14          46        15          71        13          96        15
      22         16          47        13          72        15          97        16
      23         11          48        18          73        14          98        19
      24         21          49        13          74        18          99        23
      25         12          50        15          75        20         100        16


Calculate

(a) The mean number of deaths.

(b) The mean rate of mortality $q_{70} = \dfrac{\text{No. of deaths in one year}}{\text{No. of lives investigated}}$.

(c) The standard deviation of the number of deaths.

Would you modify your method of calculation of item (b) if the
number of lives in each investigation had not been the same?

If the frequencies of the numbers of deaths in these 100 investigations
could have been represented exactly by the binomial distribution
having a mean equal to the observed mean, calculate:

(a) The standard deviation of the rate of mortality.

(b) The number of investigations in which exactly 15 deaths would
have been recorded.

2. A large issue of Bonds of £100 each is redeemable through the
operation of a sinking fund, by annual drawings at par, the proportion
redeemable at the next drawing (which is to take place shortly) being
1 per cent of the total outstanding. The current market price of the Bonds
is £110, so that an immediate loss of £10 arises in respect of each Bond
drawn for repayment.

A holder of 5000 Bonds, whilst anticipating a loss of £500 on the above
basis at the next drawing, desires to effect an indemnity policy to cover
him in respect of excess losses arising if the proportion of his holding
drawn for repayment exceeds the anticipated 1 per cent, e.g. if 51 of his
Bonds are drawn, he requires indemnity to the extent of £10, and so on.
Calculate approximately the net premium required.

3. A normal distribution has a mean of 7.52 and a standard deviation
of 2.38. Using Table I in the Appendix calculate the probabilities

(i) that a value selected at random is greater than 12.00;

(ii) that a value selected at random lies between the limits 6.00 and 9.00.

(Note that these are not equidistant from the mean.)

4. A large transport organization decided to increase all its passenger
fares on 1st January 1941 by 10 per cent. You are asked to estimate the
passenger receipts for 1941 on the assumption that the volume of traffic
remained unchanged. The 1940 figures were:

(a) Passenger receipts £1,250,000.

(b) Fare, 1½d. per mile.

(c) Average mileage per journey 50.

In calculating the revised fares, fractions of a penny are to be taken as one
penny. How would your estimate vary if fares were to be calculated to
the nearest penny, halfpennies being taken as one penny?



CHAPTER III 


CORRELATION 

1. Hitherto we have considered only a single variable and the
frequencies with which it occurs. When we investigate two variables
x and y and the frequencies with which pairs of values are associated
we meet the phenomenon of correlation.

Suppose, for instance, that we tabulate the height to the nearest 
inch of a number of fathers and their eldest sons. We might have 
a table similar to the following: 

Table I

    Height of father   Height of son        Height of father   Height of son
    (nearest inch)     (nearest inch)       (nearest inch)     (nearest inch)

          63                 65                   69                 67
          64                 62                   69                 69
          65                 67                   69                 73
          65                 70                   70                 68
          66                 64                   70                 69
          66                 66                   70                 74
          66                 71                   71                 67
          67                 70                   71                 70
          67                 66                   72                 70
          68                 68                   72                 73
          68                 70                   73                 70
          68                 72                   74                 74

This is an example of the simplest type of correlation table, which 
consists merely of a list of observed values of x and the corresponding 
values of y. These values need not necessarily be arranged according
to a definite scheme, and if there happened to be two observations 
for which a given value of x was associated with a given value of y 
they would appear as two separate items in the list. For instance, in 
Table I we might have had two fathers of height 70 inches to the 
nearest inch shown as having sons of height 68 inches to the nearest 
inch. This would be represented by two separate entries in the 
table. 


In the more general type of table the data are extensive and have
to be grouped so as to show the frequencies with which values of x 
within a given range are associated with values of y within the 
various ranges adopted in grouping. For instance, Table II shows 
for a given year how maximum and minimum temperatures were 
associated. As frequencies have to be shown as well as values of the 
two variables, a double-entry table is used. 

Table II

    Maximum                Minimum temperatures in degrees Fahrenheit
    temperatures
    in degrees       Below 31   31-39   39-47   47-55   55-63   Over 63
    Fahrenheit

    Below 45            10        30       5       -       -       -
    45-55               10        50      40      10       -       -
    55-65                -        10      50      40       5       -
    65-75                -         -      10      30      10       -
    75-85                -         -       5      15      25       -
    Over 85              -         -       -       -       5       5


We usually require to know to what extent one variable varies 
with another. The extremes of these comparable variations are 
important: 

(a) If large values of x tend to be associated with large values of 
y there is said to be positive correlation. 

(b) If large values of x tend to be associated with small values of 
y and small values of x tend to be associated with large values 
of y there is said to be negative correlation. 

The figures in Table II suggest positive correlation. 

The assessment of the magnitude of correlation requires con¬ 
siderable analysis and usually involves the calculation of an index 
known as the coefficient of correlation. This will be dealt with in 
para. 4, but before we proceed to the analytical details there are one 
or two general principles which should always be borne in mind. 
The application of them to a given set of data may even indicate 
that analytical work is unnecessary and liable to produce misleading
results. 

In the first place it is important to see that the pairs of observations 
have some definite link quite apart from the association which it is
desired to measure. For instance, in Table I the link is that of 
father and son, while in Table II the pairs of readings relate to the 
same day of the year. 

Secondly, although the coefficient of correlation will nearly always 
have to be found, it should be remembered that unless some 
hypothesis is made concerning the mathematical form of the popula¬ 
tion the parameters of which are to be estimated from the given 
data as set out in tabular form (as for instance in Table II), these 
actually give more information than any single index can do. 
Careful examination of the data will often, therefore, give useful 
information which is submerged in subsequent analytical work. An 
example of this will be given later. 

Finally, it must never be assumed that correlation implies 
causation. Because x and y show a marked tendency to vary together
it must on no account be inferred that a change in x will cause a 
change in y. 

An example will make this clear. An investigation of cases of 
sunstroke in a series of years and the amount of home-grown wheat 
in those years would probably show a marked degree of correlation 
between the two factors, although neither could be said to influence 
the other in any way. The true explanation would almost certainly 
be that a hot summer tends to produce a bumper harvest and many 
cases of sunstroke. Two variables which are correlated are, in fact, 
very often both affected by a common cause, or combination of 
causes, but only rarely is one caused directly by the other.

2. Scatter-diagrams.

A method of representing the given data which naturally suggests
itself is to plot on squared paper the various associated values of
x and y, thus producing what is known as a scatter-diagram.

For instance, the data given in Table II could be represented by 
the following scatter-diagram (Fig. 2) if the observations were 
assumed to be concentrated at the mid-points of the intervals and 
if the group below 31° were taken as 23-31 and the groups at the
other end treated similarly. Since the frequencies are small no 
serious error would be involved. 

Such a diagram does not usually of itself give any clear indication 
of the presence or otherwise of correlation and is open to the
objection that it does not indicate the frequencies with which the pairs
of associated values are observed.

This latter objection can be overcome by making a three-dimen¬ 
sional figure, using a third variable z to represent the frequency, 
but this conception is only of theoretical interest. 

In order to condense the information conveyed by the scatter- 
diagram the following method is used. 



Fig. 2

Taking each observed value of x in turn, the mean value of y
corresponding to it is plotted, the frequencies being allowed for in
the usual way in calculating the means. Thus, corresponding to the
assumed value of 35° minimum temperature, we find assumed
maximum temperatures of 40°, 50° and 60° occurring with
frequencies 30, 50 and 10 respectively, giving a mean temperature of
(1/90) [(30 × 40) + (50 × 50) + (10 × 60)] = 48° approx.

Similarly, the mean maximum temperature corresponding to a
minimum temperature of 43° is found to be 57°, and so on. These
mean temperatures are indicated in the diagram by dots with rings 
round them. 
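
The means plotted in the diagram can be computed from Table II as in the following sketch. The mid-points are those assumed in the text, with the open-ended groups closed off at the same width as their neighbours (for instance "below 31" taken as 23-31); the function names are illustrative only.

```python
# Assumed mid-points of the minimum (columns) and maximum (rows) groups.
min_mid = [27, 35, 43, 51, 59, 67]
max_mid = [40, 50, 60, 70, 80, 90]

# Frequencies of Table II; rows are maximum-temperature groups.
freq = [
    [10, 30,  5,  0,  0, 0],
    [10, 50, 40, 10,  0, 0],
    [ 0, 10, 50, 40,  5, 0],
    [ 0,  0, 10, 30, 10, 0],
    [ 0,  0,  5, 15, 25, 0],
    [ 0,  0,  0,  0,  5, 5],
]

def mean_max_given_min(col):
    """Mean maximum temperature for one minimum-temperature group."""
    total = sum(row[col] for row in freq)
    return sum(row[col] * m for row, m in zip(freq, max_mid)) / total

def mean_min_given_max(row_index):
    """Mean minimum temperature for one maximum-temperature group."""
    row = freq[row_index]
    return sum(f * m for f, m in zip(row, min_mid)) / sum(row)

for col, x in enumerate(min_mid):
    print(x, round(mean_max_given_min(col), 1))
```

The second function gives the means of the other variable, taken along the rows instead of the columns.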

When the data are extensive the simplification thus achieved by 
plotting the means is considerable and the means themselves will 
be found in general to lie on or near a smooth curve known as a 
regression curve.
In the same way we could take each observed value of y in turn
and plot the mean value of x corresponding to it, thus obtaining a 
second regression curve. 

In the simplest examples these curves can be assumed to be 
straight lines and the correlation is said to be linear. 

In what follows, unless the contrary is stated, it will always be 
assumed that linear correlation is under discussion. 

In Fig. 2 the data are so scanty that an attempt to fit a more 
elaborate curve would be unjustified and the lines of regression 
have been roughly sketched in. The mean values of x are indicated 
by crosses. 

Having drawn the regression lines it is possible, as will be ex¬ 
plained later, to deduce approximately the coefficient of correlation. 
The usual way of calculating this index is, however, by the analytical 
processes discussed in the next few paragraphs. At the same time 
it should be emphasized that the analysis is based on the assumption 
that the correlation is linear and does not give any indication of 
whether such an assumption is justified. By plotting the means we 
can throw considerable light on this very important point. 


3 . Analytical approach. 

Suppose that we have N pairs of observed values $(x_1, y_1), (x_2, y_2), \dots (x_n, y_n)$
occurring with frequencies $f_1, f_2, \dots f_n$ $\left(\sum_{1}^{n} f_r = N\right)$.

Several of the x's may of course be equal while the y's differ (e.g. in
the example dealt with above the assumed minimum temperature
43° is associated with assumed maximum temperatures 40°, 50°,
60°, 70° and 80°), and the same applies mutatis mutandis to the y's.

First consider the regression line which passes through or near
to the mean values of y found for each observed value of x, and let
its equation be

$$y = m_1 x + c_1,$$

where $m_1$ and $c_1$ have to be found.

Actually it is more convenient to revert to the original data rather 
than to deal with the various means. 

The pair of values denoted by $P_t\,(x_t, y_t)$ in Fig. 3 occurs with
frequency $f_t$.

The ordinate through $P_t$ cuts the line $y = m_1 x + c_1$ at $Q_t$, the
ordinate of which is $m_1 x_t + c_1$.

Hence the distance $Q_tP_t = y_t - m_1 x_t - c_1$.

If the line is to fit the data satisfactorily one obvious
requirement is that the total of all distances such as $Q_tP_t$ should be small,
allowing for different signs cancelling and also for the frequencies
involved, i.e. we should expect $\sum f_t\,(y_t - m_1 x_t - c_1)$ to be small when
the summation extends over all the observations.



Fig. 3

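If we make this sum zero we obtain

$$\sum f_t y_t = m_1 \sum f_t x_t + c_1 \sum f_t$$

or, dividing by N,

$$\frac{\sum f_t y_t}{N} = m_1\,\frac{\sum f_t x_t}{N} + c_1,$$

i.e.

$$\bar{y} = m_1 \bar{x} + c_1,$$

where $\bar{y}$ is the mean of all the y's and $\bar{x}$ is the mean of all the x's.

Thus the regression line passes through the point $(\bar{x}, \bar{y})$, which we
take as a new origin, writing $x_t = \bar{x} + X_t$ and $y_t = \bar{y} + Y_t$.

The equation of the regression line is now $Y = m_1 X$ and the
distance $Q_tP_t = Y_t - m_1 X_t$.

The expression $\sum f_t\,(Y_t - m_1 X_t)^2$, where the summation extends
over all the observations, is essentially positive, but if the fit is
good it should be small. We therefore choose $m_1$ so as to make it a
minimum.

For a minimum ($m_1$ being the variable),

$$\frac{d}{dm_1}\sum f_t\,(Y_t - m_1 X_t)^2 = 0,$$

i.e.

$$\sum f_t X_t\,(Y_t - m_1 X_t) = 0$$

or

$$m_1 = \frac{\sum f_t X_t Y_t}{\sum f_t X_t^2}.$$

If we divide the numerator and denominator by N, the number of
pairs of observations, the denominator is $\sigma_x^2$, where

$$\sigma_x = \sqrt{\frac{\sum f_t X_t^2}{N}}$$

is the standard deviation of all the x's.

Denoting the numerator $\dfrac{\sum f_t X_t Y_t}{N}$ by P, the slope of the
regression line,

$$m_1 = \frac{P}{\sigma_x^2}. \qquad (2)$$

The equation of the line is

$$Y = \frac{P}{\sigma_x^2}\,X$$

or, referred to the original axes,

$$(y - \bar{y}) = \frac{P}{\sigma_x^2}\,(x - \bar{x}). \qquad (3)$$

This is known as the line of regression of y on x.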

Similarly, the line of regression passing through or near the mean
values of x found for each value of y is known as the regression line
of x on y.

Denote its equation by $x = m_2 y + c_2$.

In Fig. 3 the line $M_t P_t R_t$ parallel to the axis of x cuts this line
at $R_t$.

The distance $R_t P_t = x_t - m_2 y_t - c_2$.

For a good fit we put

$$\sum f_t (x_t - m_2 y_t - c_2) = 0$$

and make $\sum f_t (x_t - m_2 y_t - c_2)^2$ a minimum.

The first of these equations readily reduces to

$$\bar{x} = m_2 \bar{y} + c_2,$$

so that the second regression line also passes through $(\bar{x}, \bar{y})$.

Taking axes through this point the second expression reduces to

$$\sum f_t (X_t - m_2 Y_t)^2.$$

Differentiating with respect to the variable $m_2$, and proceeding
as before, we find that for the expression to be a minimum

$$m_2 = \frac{p}{\sigma_y^2}, \tag{4}$$

where $\sigma_y$ is the standard deviation of the y's.

Referred to the original axes the regression line of x on y is therefore

$$x - \bar{x} = \frac{p}{\sigma_y^2}(y - \bar{y}). \tag{5}$$

The expressions $\frac{p}{\sigma_x^2}$ and $\frac{p}{\sigma_y^2}$ are known as coefficients of regression
or regression coefficients.
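
The two regression coefficients can be computed mechanically from a list of paired values and their frequencies. The following Python sketch is offered only as an illustration: the function name `regression_coefficients` and the small data set are assumptions made for the example and are not taken from the text.

```python
def regression_coefficients(xs, ys, fs):
    """Return (x_bar, y_bar, m1, m2) for pairs (x_t, y_t) observed f_t times.

    m1 = p / sigma_x^2 is the slope of the regression line of y on x,
    m2 = p / sigma_y^2 is the slope of the regression line of x on y.
    """
    n = sum(fs)
    x_bar = sum(f * x for f, x in zip(fs, xs)) / n
    y_bar = sum(f * y for f, y in zip(fs, ys)) / n
    # Second moments and product moment about the means.
    var_x = sum(f * (x - x_bar) ** 2 for f, x in zip(fs, xs)) / n
    var_y = sum(f * (y - y_bar) ** 2 for f, y in zip(fs, ys)) / n
    p = sum(f * (x - x_bar) * (y - y_bar) for f, x, y in zip(fs, xs, ys)) / n
    return x_bar, y_bar, p / var_x, p / var_y


# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]
fs = [1, 2, 2, 1]

x_bar, y_bar, m1, m2 = regression_coefficients(xs, ys, fs)

# Deviations of the observations from the line of regression of y on x.
residual_total = sum(f * ((y - y_bar) - m1 * (x - x_bar))
                     for f, x, y in zip(fs, xs, ys))
```

Here `residual_total` should vanish (apart from rounding), which is the first condition imposed on the line in the derivation above.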

Before proceeding to the coefficient of correlation let us consider
exactly what the lines of regression enable us to do.

Take, for instance, the line of regression of y on x (Equation (3)).

This gives, for any value of x, the "expected value" of y in the
sense that "expected value" is used in the Theory of Probability.
We infer, therefore, not that this value of y will in fact correspond
to the chosen value of x in a given observation, but that if a large
number of observations were made, always keeping x the same, the
mean value of the various y's would approximate closely to the
value derived from the equation, as the number of observations was
increased.

If the value of x substituted in the equation is actually one of
those included in the data the value of y found from the equation
will not generally be equal to the mean of the y's in the data. For
instance, we found on p. 54 that the mean value of y (max. temp.)
corresponding to an assumed minimum temperature of 35° F. was
about 48° F. If, however, we substituted 35 for x in the equation
of the line of regression of y on x we should not expect to obtain
48° F. but a more reliable "expectation" based on the whole of the
data, on the assumption that correlation was linear.

Therefore to answer such questions as "What maximum temperature
would you expect to correspond to a minimum temperature
of 35° F.?" we should use the equation of the line of regression of
y on x. Similarly, the regression line of x on y would give the
expected minimum temperature corresponding to a given maximum
temperature.

4. Coefficient of correlation.

As was stated on p. 52, correlation is said to be 

(i) positive if large values of x tend to correspond with large 
values of y, and vice versa; 

(ii) negative if large values of x tend to correspond with small
values of y, and vice versa.

Very often, however, there is no apparent tendency for x and y to
vary together. Any given value of x seems to occur as often in 
conjunction with large as with small values of y, and similarly 
any given value of y is associated with large and small values of x. 
Correlation is then likely to be small, but we cannot be sure that 
it is negligible until we apply the tests derived from the Theory 
of Sampling (Chapter IV). 

If we take our origin at the means of x and y and denote the values
referred to this origin by X and Y as above:

(i) for positive correlation the values of XY will tend to be 
large and positive; 

(ii) for negative correlation the values of XY will tend to be 
large and negative; 

(iii) for a small degree of correlation the values of XY will be 
small and fairly evenly divided between positive and negative 
terms. 

In other words, if $f_t$ is the frequency with which the values $X_t$ and
$Y_t$ are observed together and $\sum f_t = N$, the expression $p = \frac{1}{N}\sum f_t X_t Y_t$
seems a useful measure of the extent and sign of the correlation.

Unfortunately p reflects the scales used for x and y. By altering
the scale of either variable we alter p, while the correlation is of
course the same as before. Scale can best be allowed for by measuring
the x's in terms of $\sigma_x$ and the y's in terms of $\sigma_y$. This will be done
extensively when we come to consider Sampling.

If we take $\frac{p}{\sigma_x \sigma_y}$ as our measure of correlation it will be seen that
a change in the scale adopted for either x or y affects numerator and
denominator alike and the expression can claim to be an absolute
(as distinct from merely relative) measure of the correlation. It is
of course symmetrical in x and y and is known as the coefficient of
correlation (usually denoted by r).

$$\text{Coefficient of correlation } r = \frac{p}{\sigma_x \sigma_y}. \tag{6}$$

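In view of the above remarks about scale it seems more logical to
write the equation of the lines of regression in the form:

$$\frac{y - \bar{y}}{\sigma_y} = \frac{p}{\sigma_x \sigma_y} \cdot \frac{x - \bar{x}}{\sigma_x} \tag{7}$$

and

$$\frac{x - \bar{x}}{\sigma_x} = \frac{p}{\sigma_x \sigma_y} \cdot \frac{y - \bar{y}}{\sigma_y}. \tag{8}$$

It will now be seen that the first factor on the right-hand side is
r in both equations and the coefficients of regression can be written
in the more usual forms:

$$m_1 \text{ (coefficient of regression of y on x)} = r\frac{\sigma_y}{\sigma_x}, \tag{9}$$

$$m_2 \text{ (coefficient of regression of x on y)} = r\frac{\sigma_x}{\sigma_y}. \tag{10}$$

Finally we have

$$r = \sqrt{m_1 m_2}. \tag{11}$$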

All these results should be memorized. 
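
The dependence of p on the scale of measurement, and the freedom of r from it, can be checked numerically. The following Python sketch is an illustration only: the helper `summary` and the five pairs of values are assumptions made for the example. It multiplies every x by 10 and compares p and r before and after.

```python
from math import sqrt


def summary(xs, ys):
    """Return (sigma_x, sigma_y, p) for pairs each observed once."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sigma_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    p = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    return sigma_x, sigma_y, p


# Hypothetical observations.
xs = [2.0, 4.0, 5.0, 7.0, 9.0]
ys = [1.0, 3.0, 2.0, 6.0, 8.0]

sigma_x, sigma_y, p = summary(xs, ys)
r = p / (sigma_x * sigma_y)      # coefficient of correlation
m1 = r * sigma_y / sigma_x       # regression coefficient of y on x
m2 = r * sigma_x / sigma_y       # regression coefficient of x on y

# Change the scale of x: every value is multiplied by 10.
sigma_x10, _, p10 = summary([10 * x for x in xs], ys)
r10 = p10 / (sigma_x10 * sigma_y)
```

In this sketch `p10` is ten times `p`, whereas `r10` agrees with `r`, and the product `m1 * m2` agrees with the square of `r`.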

In practical work it is very desirable to set out the calculations in
tabular form in such a way that $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$ and p can all be derived
in turn.

The student is already familiar with the method for calculating the
first four of these values by selecting a convenient origin and scale
and making subsequent adjustments. The same can be done in
calculating p.

By definition, $p = \frac{1}{N}\sum f_t X_t Y_t$, where $X_t$ and $Y_t$ are measured from
their respective means.

Suppose that we choose convenient origins so that the coordinates
of the point representing the observation are $x_t = X_t + \bar{x}$ and
$y_t = Y_t + \bar{y}$.

We first calculate $\frac{1}{N}\sum f_t x_t y_t$. This is generally called the product
moment about the origin chosen, or, more correctly, about the
axes chosen.

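Now

$$\frac{1}{N}\sum f_t x_t y_t = \frac{1}{N}\sum f_t (X_t + \bar{x})(Y_t + \bar{y}) = \frac{1}{N}\sum f_t X_t Y_t + \frac{\bar{x}}{N}\sum f_t Y_t + \frac{\bar{y}}{N}\sum f_t X_t + \bar{x}\bar{y},$$

since $\bar{x}$ and $\bar{y}$ are constants, viz. the distances of the mean from
the origin chosen.

Since $X_t$ and $Y_t$ are measured from the mean, $\sum f_t X_t = \sum f_t Y_t = 0$,
and we have

$$\frac{1}{N}\sum f_t x_t y_t = p + \bar{x}\bar{y}, \tag{12}$$

$$p = \frac{1}{N}\sum f_t x_t y_t - \bar{x}\bar{y}. \tag{13}$$

In other words, we find the product moment, using any convenient
origin, and then derive p by deducting $\bar{x}\bar{y}$, where $\bar{x}$ is the
distance of the mean of the x's from the origin (allowing for sign)
and $\bar{y}$ is the distance of the mean of the y's from the origin.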
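
That the result does not depend on the origins chosen can be confirmed by a short calculation. The Python sketch below is illustrative only; the function `product_moment` and the data are assumptions made for the example. It applies the rule just stated twice, with different working origins, and compares the two values of p.

```python
def product_moment(xs, ys, fs, x_origin=0.0, y_origin=0.0):
    """Product moment about the means, found from arbitrary working origins.

    The product moment about the chosen origins is calculated first and the
    product of the two means (measured from those origins) is then deducted.
    """
    n = sum(fs)
    u = [x - x_origin for x in xs]   # x measured from the working origin
    v = [y - y_origin for y in ys]   # y measured from the working origin
    u_bar = sum(f * a for f, a in zip(fs, u)) / n
    v_bar = sum(f * b for f, b in zip(fs, v)) / n
    raw = sum(f * a * b for f, a, b in zip(fs, u, v)) / n
    return raw - u_bar * v_bar


# Hypothetical paired values and frequencies.
xs = [41.0, 43.0, 47.0, 52.0]
ys = [55.0, 61.0, 60.0, 70.0]
fs = [3, 5, 4, 2]

p_natural = product_moment(xs, ys, fs)                       # origins at zero
p_shifted = product_moment(xs, ys, fs, x_origin=43.0, y_origin=60.0)
```

The two values agree apart from rounding, so the origins may be chosen purely for arithmetical convenience.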

The following examples should make this clearer, and it is 
essential that they should be closely studied and every stage of the 
working verified. 

Example 1. 

Calculate the coefficient of correlation for the following series of 
observations: 

Table III 


        Average       Average index                 Average       Average index
        yield of      number of wholesale           yield of      number of wholesale
        Consols       commodity prices              Consols       commodity prices
Year    during year   during year           Year    during year   during year

1810    4.5           171                   1823    3.8           107
1811    4.7           164                   1824    3.3           106
1812    5.1           147                   1825    3.5           124
1813    4.9           138                   1826    3.8           108
1814    4.5           137                   1827    3.6           108
1815    5.0           131                   1828    3.6           97
1816    4.8           109                   1829    3.3           95
1817    4.1           141                   1830    3.5           97
1818    3.9           160                   1831    3.8           99
1819    4.2           135                   1832    3.6           94
1820    4.4           124                   1833    3.4           90
1821    4.0           113                   1834    3.3           94
1822    3.8           106

This is typical of the simpler problems where frequencies need not be
specifically introduced as they are all unity, the table being simply a list.
Take as origins:

average yield of Consols during year = 4.0 (variable x),
average index number during year = 120 (variable y).

Taking .1 as the unit of x, the work can be arranged as follows:


      x             y          x²        y²           xy
   +     -       +     -                           +      -

   5            51             25     2,601      255
   7            44             49     1,936      308
  11            27            121       729      297
   9            18             81       324      162
   5            17             25       289       85
  10            11            100       121      110
   8                  11       64       121              88
   1            21              1       441       21
         1      40              1     1,600              40
   2            15              4       225       30
   4             4             16        16       16
   -                   7        -        49        -
         2            14        4       196       28
         2            13        4       169       26
         7            14       49       196       98
         5       4             25        16              20
         2            12        4       144       24
         4            12       16       144       48
         4            23       16       529       92
         7            25       49       625      175
         5            23       25       529      115
         2            21        4       441       42
         4            26       16       676      104
         6            30       36       900      180
         7            26       49       676      182

  62    58     252   257      784    13,693     2398    148


It will be noticed that positive terms are shown on the left and negative 
terms on the right of each column. This facilitates the additions and 
lessens the likelihood of arithmetical errors which may arise if this is 
not done.

Since the total frequency is 25 we have

$$\bar{x} = \frac{62 - 58}{25} = .16 \text{ units},$$

$$\bar{y} = \frac{252 - 257}{25} = -.20 \text{ units},$$

$$\sigma_x^2 = \frac{784}{25} - (.16)^2 = 31.3344, \qquad \sigma_x = 5.6 \text{ units},$$

$$\sigma_y^2 = \frac{13{,}693}{25} - (-.20)^2 = 547.68, \qquad \sigma_y = 23.4 \text{ units},$$

$$p = \frac{2398 - 148}{25} - (.16)(-.20) = 90.03 \text{ units},$$

$$\text{Coefficient of correlation } r = \frac{90.03}{(5.6)(23.4)} = .69 \text{ approx.}$$

All the work has been done in class units, but if the mean yield of 
Consols and the standard deviation are required in terms of the original 
scale and origin we have the following results. 

Mean yield of Consols = 4.0 + (.16)(.1) = 4.016.

Standard deviation = 5.6 × .1 = .56.

The mean index number is 120 - .20 = 119.8.
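
The arithmetic of Example 1 may be verified mechanically. The Python sketch below is an illustration and not part of the original working: it takes the yields and index numbers as read from Table III, converts them to the class units used above (origin 4.0 and unit .1 for x, origin 120 for y) and recomputes the totals and the coefficient of correlation. The variable names are chosen for the sketch.

```python
from math import sqrt

# Data as read from Table III, years 1810 to 1834 in order.
yields = [4.5, 4.7, 5.1, 4.9, 4.5, 5.0, 4.8, 4.1, 3.9, 4.2, 4.4, 4.0, 3.8,
          3.8, 3.3, 3.5, 3.8, 3.6, 3.6, 3.3, 3.5, 3.8, 3.6, 3.4, 3.3]
prices = [171, 164, 147, 138, 137, 131, 109, 141, 160, 135, 124, 113, 106,
          107, 106, 124, 108, 108, 97, 95, 97, 99, 94, 90, 94]

# Class units: x in tenths measured from 4.0, y measured from 120.
xs = [round((v - 4.0) * 10) for v in yields]
ys = [v - 120 for v in prices]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

x_bar = sum_x / n
y_bar = sum_y / n
sigma_x = sqrt(sum_x2 / n - x_bar ** 2)
sigma_y = sqrt(sum_y2 / n - y_bar ** 2)
p = sum_xy / n - x_bar * y_bar
r = p / (sigma_x * sigma_y)

print(round(x_bar, 2), round(y_bar, 2), round(p, 2), round(r, 2))
```

The totals so obtained should agree with the bottom line of the working table, and r with the value found above to two places.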


Example 2. 

As a second example let us consider the data of Table II, which have 
been examined graphically earlier in this chapter. 

The chief difficulty arises in determining the product moment denoted
above by $\frac{1}{N}\sum f_t x_t y_t$.

To do this, a simple device, used extensively, is to write the product
$x_t y_t$ close to each frequency $f_t$ and then to insert $f_t x_t y_t$, but not to attempt
to write down $f_t x_t y_t$ in one step.

A little practical experience will convince the reader that the preliminary
step of calculating $x_t y_t$ is well worth while.

In the following table $x_t y_t$ is shown in the top left-hand corner of each
division and $f_t x_t y_t$ is shown in the bottom right-hand corner.

As before, x represents minimum temperatures and y maximum temperatures,
but the origin has been taken at the centre of the 39-47 group
(minimum temperatures) and the centre of the 55-65 group (maximum
temperatures). The class-intervals have been taken as units, so that the
frequencies, assumed concentrated at the mid-points of the intervals,
occur for values of x from -2 to +3 and for y from -2 to +3, although
the class-intervals are different.

[Table: frequencies of minimum and maximum temperatures in degrees F.]

The line below the data marked "Total frequencies" is self-explanatory.
Since the "total frequency" 20 occurs for x = -2 its first moment
about the origin of x is -40, the first entry in the next line.

Similarly, the second moment is +80, as shown immediately below.

The last line, "Product moments of frequencies", is obtained by
adding vertically the products $f_t x_t y_t$ previously inserted.

The columns on the right headed "Total frequencies", etc. are
obtained similarly, but now moments are about the origin of y, e.g. the
total frequency 45 occurs for the value y = 2, so that its first and second
moments are 90 and 180. The "Product moments of frequencies" are
not needed a second time, since the bottom line gives us $\sum f_t x_t y_t$.

The other columns and rows are added as shown and we proceed as 
follows: 

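$$\bar{x} = \frac{\sum f_t x_t}{N} = .192 \text{ class units},$$

$$\sigma_x^2 = \frac{\sum f_t x_t^2}{N} - (.192)^2 = 1.306, \text{ in terms of class units},$$

$$\bar{y} = \frac{\sum f_t y_t}{N} = -.082 \text{ class units},$$

$$\sigma_y^2 = \frac{\sum f_t y_t^2}{N} - (-.082)^2 = 1.664, \text{ in terms of class units}.$$

$$\text{Product moment } p = \frac{\sum f_t x_t y_t}{N} - (.192)(-.082) = 1.153, \text{ in terms of class units}.$$

$$\text{Coefficient of correlation } r = \frac{1.153}{\sqrt{1.306 \times 1.664}} = .782.$$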

It will be noticed that all the work has been done in class units, which 
is the simplest plan if only r is required. 

In terms of degrees Fahrenheit (as given in the original data) 

Mean minimum temperature = 43 + .192 × 8 = 44.5° F. approx.,

since the origin is 43 and scale (class-interval) is 8.

Similarly,

Mean maximum temperature = 60 + 10(-.082) = 59.2° F. approx.

To obtain $\sigma_x$, $\sigma_y$ and p in terms of degrees we multiply by 8, 10 and 80
respectively; the value of r is clearly unaffected, thus illustrating the
advantage of r rather than p as a measure of correlation.

No adjustment has been made for the error involved by assuming the
frequencies to be concentrated at the mid-points of the intervals. Such
an adjustment is not usually made in calculating a coefficient of correlation.
It is not easy to adjust p and, unless the data are very extensive,
to do so would be an unjustifiable refinement. The σ's can easily be
corrected, but there is no point in doing this if the numerator is not
dealt with.


This coefficient of correlation could be obtained approximately by
inspection from the diagram in which the lines of regression were
drawn (Fig. 2). These lines intersect at $(\bar{x}, \bar{y})$ and their slopes give
the coefficients of regression.

The line of regression of y on x is

$$y - \bar{y} = r\frac{\sigma_y}{\sigma_x}(x - \bar{x}),$$

and hence the tangent of the angle which it makes with the x-axis is $r\frac{\sigma_y}{\sigma_x}$.

Similarly, the tangent of the angle which the other regression line
makes with the y-axis is $r\frac{\sigma_x}{\sigma_y}$.

The product gives $r^2$ approximately; cf. equation (11).


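5. Properties of r.

On p. 56 the coefficient of regression $m_1$ was found by making

$$\sum (Y_t - m_1 X_t)^2 \text{ a minimum}. \tag{14}$$

Substituting for $m_1$, this becomes

$$\sum \left(Y_t - r\frac{\sigma_y}{\sigma_x}X_t\right)^2 = \sum Y_t^2 - 2r\frac{\sigma_y}{\sigma_x}\sum X_t Y_t + r^2\frac{\sigma_y^2}{\sigma_x^2}\sum X_t^2. \tag{15}$$

If there are N pairs of observations, the first and last terms are
clearly $N\sigma_y^2$ and $r^2\frac{\sigma_y^2}{\sigma_x^2}N\sigma_x^2$ respectively.

Also $\sum X_t Y_t = Np = Nr\sigma_x\sigma_y$.

Hence the right hand side of (15) reduces to

$$N(\sigma_y^2 - 2r^2\sigma_y^2 + r^2\sigma_y^2) = N(1 - r^2)\sigma_y^2. \tag{16}$$

But since $(Y_t - m_1 X_t)^2$ is always positive the expression (16) is essentially
positive or zero.

Hence $1 - r^2$ must be positive or zero and

$$r^2 \le 1. \tag{17}$$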

In other words, r cannot be numerically greater than 1.

The expression $Y_t - r\frac{\sigma_y}{\sigma_x}X_t$, previously denoted by $Q_t P_t$,
gives the difference between the value $Y_t$ associated with $X_t$ and
the ordinate of the point on the regression line $Y = m_1 X$ with the
same abscissa.

Hence $\sum\left(Y_t - r\frac{\sigma_y}{\sigma_x}X_t\right)^2$, which we have seen reduces to $N(1 - r^2)\sigma_y^2$,
gives the sum of the squares of these differences or deviations.

If r = ±1 each of the deviations must be zero, since all the terms
of the summation are positive. That is to say, when there is no
"scattering", but all the observations lie on the regression lines, the
correlation is perfect and r = ±1 according as correlation is positive
or negative. ($\sigma_y$ is not zero by hypothesis.)

If r = 0 the expression (16) reduces to $N\sigma_y^2$, i.e. the sum of the
squares of the deviations from $\bar{y}$.

This can happen only if the regression line is parallel to the axis
of x so that the average value of y for every value of x is $\bar{y}$.

When this happens there is no correlation and the points of the
scatter-diagram do not tend to cluster round any straight line.
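
The identity on which this section rests, namely that the sum of the squares of the deviations from the regression line of y on x equals N(1 - r²) times the square of the standard deviation of the y's, can be checked numerically. The Python sketch below is illustrative only; the six pairs of values and the variable names are assumptions made for the example.

```python
from math import sqrt

# Hypothetical paired observations, each occurring once.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
X = [x - x_bar for x in xs]          # deviations from the mean of x
Y = [y - y_bar for y in ys]          # deviations from the mean of y

sigma_x = sqrt(sum(a * a for a in X) / n)
sigma_y = sqrt(sum(b * b for b in Y) / n)
p = sum(a * b for a, b in zip(X, Y)) / n
r = p / (sigma_x * sigma_y)
m1 = r * sigma_y / sigma_x           # slope of the regression line of y on x

# Sum of squared deviations from the regression line Y = m1 * X.
residual_ss = sum((b - m1 * a) ** 2 for a, b in zip(X, Y))
# The closed form to which it should reduce.
closed_form = n * (1 - r * r) * sigma_y ** 2
```

Since `residual_ss` is a sum of squares it cannot be negative, and `closed_form` shares that property only if r lies between -1 and +1.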


6. Standard deviation of the sum or difference of two variables. 

Suppose that we have n values of a variable x with mean $\bar{x}$ and
standard deviation $\sigma_x$ and also m values of a variable y with mean $\bar{y}$
and standard deviation $\sigma_y$.

By definition

$$n\sigma_x^2 = \sum_1^n (x - \bar{x})^2$$

and

$$m\sigma_y^2 = \sum_1^m (y - \bar{y})^2.$$

Now suppose that a new variable z is formed, where z = x + y,
and that every value of x is associated with each value of y, thus
producing mn values of z.

We can find the mean and standard deviation of these values of
z as follows:

$$\bar{z} \text{ (the mean)} = \frac{1}{mn}\sum_1^{mn} z = \frac{1}{mn}\sum_1^{mn}(x + y).$$

Although there are mn different values of x + y on the right-hand
side there are only n different values of x, each being repeated m
times:

$$\sum_1^{mn} x = m\sum_1^n x.$$

Similarly

$$\sum_1^{mn} y = n\sum_1^m y,$$

since there are only m different values of y.

Hence

$$\bar{z} = \frac{1}{mn}\left[m\sum_1^n x + n\sum_1^m y\right] = \frac{1}{mn}\left[mn\bar{x} + nm\bar{y}\right] = \bar{x} + \bar{y}.$$

If $\sigma_z$ is the standard deviation,

$$mn\sigma_z^2 = \sum_1^{mn}(x - \bar{x} + y - \bar{y})^2 = \sum_1^{mn}(x - \bar{x})^2 + \sum_1^{mn}(y - \bar{y})^2 + 2\sum_1^{mn}(x - \bar{x})(y - \bar{y}). \tag{19}$$

As before, the term $\sum_1^{mn}(x - \bar{x})^2$ reduces to $m\sum_1^n (x - \bar{x})^2$ and
$\sum_1^{mn}(y - \bar{y})^2$ to $n\sum_1^m (y - \bar{y})^2$.

The third term is a "true" summation in that all the mn terms are
different. They can, however, be divided into sections as follows.

A particular factor $(x_r - \bar{x})$ is associated with every term of the
type $y_s - \bar{y}$ and the sum of these products is $(x_r - \bar{x})\sum_{s=1}^m (y_s - \bar{y})$.

But since $\bar{y}$ is the mean of the y's, $\sum_{s=1}^m (y_s - \bar{y}) = 0$.

In this way we can split up the third term of (19) into n groups
each of m terms the sum of which is zero.

Hence

$$mn\sigma_z^2 = m\sum_1^n (x - \bar{x})^2 + n\sum_1^m (y - \bar{y})^2 = mn\sigma_x^2 + mn\sigma_y^2$$

and

$$\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2}.$$


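In the same way it can be shown that if z = x - y, then $\bar{z} = \bar{x} - \bar{y}$
and $\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2}$.

More generally, if z = x ± y ± w ± ..., where each value of each
variable is associated with all the values of the other variables in turn,

$$\bar{z} = \bar{x} \pm \bar{y} \pm \bar{w} \pm \ldots$$

and

$$\sigma_z^2 = \sigma_x^2 + \sigma_y^2 + \sigma_w^2 + \ldots \tag{21}$$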
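
These results for the case in which every value of one variable is associated with every value of the other can be confirmed by forming all the mn sums and differences directly. The Python sketch below is an illustration only; the values and the helper functions `mean` and `sd` are assumptions made for the example.

```python
from itertools import product
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Standard deviation about the mean, dividing by the number of values."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Hypothetical data: n = 3 values of x and m = 4 values of y.
xs = [1.0, 4.0, 7.0]
ys = [2.0, 3.0, 5.0, 10.0]

# Every x is associated with every y, giving mn = 12 values in each list.
sums = [x + y for x, y in product(xs, ys)]
diffs = [x - y for x, y in product(xs, ys)]

expected_variance = sd(xs) ** 2 + sd(ys) ** 2
```

Both `sd(sums) ** 2` and `sd(diffs) ** 2` should agree with `expected_variance`, while the means are respectively the sum and the difference of the separate means.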

We sometimes have to deal, however, with a slightly different
problem. Suppose that we have n values of x with mean $\bar{x}$ and
standard deviation $\sigma_x$ and n values of y with mean $\bar{y}$ and standard
deviation $\sigma_y$.

If a new variable z = x + y is formed by associating each value of
x with one and only one value of y, correlation enters into the problem.
As an example x might be the height of a father and y the
height of his eldest son, so that there is what is sometimes called a
"one-to-one correspondence".

$$\bar{z} = \frac{1}{n}\sum_1^n (x + y) = \bar{x} + \bar{y}, \tag{22}$$

$$n\sigma_z^2 = \sum_1^n (x - \bar{x} + y - \bar{y})^2 = \sum_1^n (x - \bar{x})^2 + \sum_1^n (y - \bar{y})^2 + 2\sum_1^n (x - \bar{x})(y - \bar{y}). \tag{23}$$

The first two terms are clearly $n\sigma_x^2$ and $n\sigma_y^2$, but the third term
is quite different from that dealt with before. We can no longer split
it up into groups such as $(x_r - \bar{x})\sum (y_s - \bar{y})$, since each x is associated
with one and only one y.

The term $\sum_1^n (x - \bar{x})(y - \bar{y})$ is, however, the expression previously
denoted by np, i.e. $nr\sigma_x\sigma_y$, where r is the coefficient of correlation
between x and y.

Note: Correlation cannot arise if the values of x and y are not paired
off but each value of one is associated with all the values of the other.

Substituting in (23) we obtain:

$$n\sigma_z^2 = n\sigma_x^2 + n\sigma_y^2 + 2nr\sigma_x\sigma_y$$

and

$$\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2 + 2r\sigma_x\sigma_y}. \tag{24}$$

Similarly, if z = x - y,

$$\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y},$$

and the result can be generalized for z = x ± y ± w ± ....

It should be noted that correlation between every pair of variables
has to be allowed for.
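
When the values are paired, the term involving r cannot be neglected, as a direct calculation shows. In the Python sketch below the heights are hypothetical figures invented for the illustration, and the helper functions are likewise assumptions of the sketch.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Standard deviation about the mean, dividing by the number of values."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Hypothetical heights in inches; each father is paired with his own son.
fathers = [65.0, 67.0, 68.0, 70.0, 72.0]
sons = [66.0, 68.0, 67.0, 71.0, 73.0]
n = len(fathers)

x_bar, y_bar = mean(fathers), mean(sons)
sigma_x, sigma_y = sd(fathers), sd(sons)
p = sum((x - x_bar) * (y - y_bar) for x, y in zip(fathers, sons)) / n
r = p / (sigma_x * sigma_y)

sums = [x + y for x, y in zip(fathers, sons)]
diffs = [x - y for x, y in zip(fathers, sons)]

variance_of_sum = sigma_x ** 2 + sigma_y ** 2 + 2 * r * sigma_x * sigma_y
variance_of_diff = sigma_x ** 2 + sigma_y ** 2 - 2 * r * sigma_x * sigma_y
```

Here `sd(sums) ** 2` should agree with `variance_of_sum` and `sd(diffs) ** 2` with `variance_of_diff`; omitting the term in r would give the same figure for both, which the data contradict whenever r is not zero.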

Example 3. 

In Example 5 of Chapter II let us suppose that 5000 of the lives are
assured for £200 each and the other 5000 for £100 each.

The first group may give rise to claims for 0, £200, £400, £600, ...
£1,000,000 according as 0, 1, 2, ... 5000 deaths occur.

The scale is clearly £200 and it was shown in Chapter II that for a
binomial distribution such as this the mean is nqh and the standard
deviation $h\sqrt{npq}$.

Hence the mean in this group of 5000 lives is

£200 × 5000 × .01 = £10,000

and the standard deviation is

£200 $\sqrt{5000 \times .01 \times .99}$.

The other group gives rise to claims for 0, £100, £200, ... £500,000,
the scale being now £100.

The mean is therefore £5000 and the standard deviation is
£100 $\sqrt{5000 \times .01 \times .99}$.

The total claim is of the form x + y, where x refers to claims in multiples
of £200 and y to claims in multiples of £100.

The mean claim is $\bar{x} + \bar{y}$, i.e.

£10,000 + £5000 = £15,000, as we should expect.

The standard deviation

$$= \sqrt{\sigma_x^2 + \sigma_y^2} = \sqrt{(200^2 + 100^2) \times 5000 \times .01 \times .99} = £1573.$$

The probable error is therefore .67 × £1573 = £1054 approx., and to
reduce the chance of claims for more than £15,000 to one-quarter, the
total premium should be

£(15,000 + 1054) or £1. 1s. 5d. per cent. approx.
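
The figures of this example can be reproduced as follows. The Python sketch is an illustration only; the variable names are chosen for the sketch, and the rate of mortality .01 and the factor .67 for the probable error are taken from the working above.

```python
from math import sqrt

lives_per_group = 5000
q = 0.01                 # rate of mortality used in the working above
survive = 1 - q

# Group assured for 200 each and group assured for 100 each.
mean_200 = 200 * lives_per_group * q
mean_100 = 100 * lives_per_group * q
sd_200 = 200 * sqrt(lives_per_group * q * survive)
sd_100 = 100 * sqrt(lives_per_group * q * survive)

# The two groups are not paired, so no correlation term enters.
mean_total = mean_200 + mean_100
sd_total = sqrt(sd_200 ** 2 + sd_100 ** 2)

probable_error = 0.67 * sd_total
total_premium = mean_total + probable_error

# Premium per cent of the total sums assured.
total_sum_assured = lives_per_group * 200 + lives_per_group * 100
rate_per_cent = 100 * total_premium / total_sum_assured

print(round(sd_total), round(probable_error), round(rate_per_cent, 4))
```

A rate of about 1.07 per cent corresponds to the £1. 1s. 5d. per cent quoted, since .07 of a pound is roughly 1s. 5d.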

No question of correlation arises here, since any life assured for £100
can be associated with every life assured for £200.

7. Non-linear regression.

When the means of the x's and y's of the scatter-diagram cannot
reasonably be assumed to lie on straight lines the preceding analysis
breaks down and r is misleading as a measure of the relationship
between x and y.

It will be remembered that for linear regression we fitted a line
$Y = m_1 X$ to the data, where $m_1 = \frac{\sum XY}{n\sigma_x^2}$, the value of Y thus
obtained being the "best" value or expectation corresponding to
the value X.

If we denote the "best" value corresponding to $X_t$ by $Y_t'$, while
the actual observed values are $Y_{t_1}$, $Y_{t_2}$, $Y_{t_3}$, etc., then $Y_t' = m_1 X_t$
for all values of t.

Hence

$$\frac{\sigma_{Y'}}{\sigma_y} = \frac{m_1\sigma_x}{\sigma_y} = \frac{\sum XY}{n\sigma_x\sigma_y},$$

the value previously denoted by r.

From this point of view r is the ratio of the standard deviation
of the estimated values read off from the regression line to the
standard deviation of the original observations $Y_{t_1}$, $Y_{t_2}$, etc.

Very often a curve is more suitable than a straight line for showing 
the relationship between x and y, and it will be seen later that the 
process of graduation is in effect the fitting of a curve showing 
correlation between age (x) and mortality (y), the graduated curve
being a curvilinear regression line.

From such a curve the values of Y' corresponding to each x can
be measured and $\sigma_{Y'}$ calculated. Actually this is the same as $\sigma_{y'}$,
where y' is measured from the origin instead of from the mean.

$\sigma_{y'}/\sigma_y$ can be regarded as a measure of correlation and unless
regression is linear it is known as the index of correlation.

In the special case where the regression curve passes through all
the mean y's $\sigma_{y'}/\sigma_y$ is called the correlation ratio and is usually
represented by η.
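
The correlation ratio may be computed by replacing every observed y by the mean of the y's recorded for the same x and comparing the standard deviation of these means with that of the original observations. The Python sketch below is illustrative only; the function `correlation_ratio` and the data are assumptions made for the example.

```python
from math import sqrt


def correlation_ratio(groups):
    """Ratio of the s.d. of the array means to the s.d. of all the y's.

    `groups` maps each value of x to the list of y's observed with it.
    """
    all_y = [y for ys in groups.values() for y in ys]
    n = len(all_y)
    y_bar = sum(all_y) / n
    sigma_y = sqrt(sum((y - y_bar) ** 2 for y in all_y) / n)
    # Each observation is replaced by the mean of its own array, so that
    # every array mean is weighted by the number of observations in it.
    variance_of_means = sum(
        len(ys) * (sum(ys) / len(ys) - y_bar) ** 2 for ys in groups.values()
    ) / n
    return sqrt(variance_of_means) / sigma_y


# Hypothetical data in which the array means (2, 5, 4) do not lie on a line.
groups = {1: [1.0, 3.0], 2: [4.0, 6.0], 3: [3.0, 5.0]}
eta = correlation_ratio(groups)
```

For these figures the coefficient of correlation works out at about .51, whereas `eta` is about .78, the difference reflecting the departure of the array means from a straight line.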

The reader will now be in a position to appreciate (and criticize) 
the following definitions of correlation: 


(1) If two quantities vary in sympathy so that a movement in
one tends to be accompanied by a movement in the other,
they are said to be correlated.
(2) Two variables are said to be correlated when we do not find
a fixed value of the one variable equally likely to be associated
with different values of the other.

We close this chapter with an example which presents many 
points of interest. 

* Example 4. 

The following data have been collected for the purpose of investigating 
the correlation between the duration of life of married and widowed 
women and the number of children. Calculate the correlation coefficient 
and mention any approximate measures which might be used to indicate 
the extent of the correlation, stating the objections to these measures. 


Table showing ages at death, in quinquennial age            Table showing the distribution
groups, of 1095 wives and widows, with particulars          of the deaths according to
of the number of children                                   number of children

Central age   No. of    Total no.     Average no.           No. of      No. of
at death      deaths    of children   of children           children    deaths

20            29        36            1.2                   0           24
25            87        151           1.7                   1           130
30            99        261           2.6                   2           122
35            109       478           4.4                   3           134
40            90        450           5.0                   4           111
45            87        437           5.0                   5           106
50            64        370           5.8                   6           85
55            54        331           6.1                   7           91
60            69        430           6.2                   8           81
65            73        447           6.1                   9           77
70            83        547           6.6                   10          58
75            77        590           7.7                   11          23
80            78        547           7.0                   12          24
85            59        398           6.7                   13          15
90            26        212           8.2                   14          6
95            7         50            7.1                   15          2
100           4         35            8.8                   16          2
-             -         -             -                     17          2
-             -         -             -                     18          2

Total         1095      5770          -                     -           1095


Find an equation for the regression of the number of children on the
length of life of the mother and plot this on a graph together with any
information from the given data which will show whether the regression
line is satisfactory. On consideration of your graph state whether the
correlation coefficient may be regarded as the best measure in this
particular case.

The following additional information is given:

Mean age at death                                      53.292
Standard deviation of age at death calculated
on a unit of 5 years                                   4.091
Standard deviation of number of children               3.409

To investigate correlation we should expect the data to be given in the
form of a double-entry table with (say) age at death along the top (x) and
number of children down the left-hand side (y), the class-interval for x
being taken as 5 years.

Actually we are not given the data to fill in the squares but we are
given the total of each column, i.e. the number of deaths for a specified
central age at death, and also the total of each line, i.e. the number of
deaths for a specified number of children.

The column "Average number of children" is unnecessary, since it
can be obtained by dividing the "Total number of children" at each
central age by the number of deaths at that age. It will, however, be
found useful later on.

We are given $\sum_y f$ for each value of x, i.e. the total frequency for each
central age at death irrespective of number of children, and also $\sum_x f$ for
each value of y, i.e. the total frequency for each number of children
irrespective of age of mother at death. ($\sum_y$ is used to denote summation
with regard to y and $\sum_x$ summation with regard to x.) Hence we could
calculate $\bar{x}$, $\bar{y}$, $\sigma_x$ and $\sigma_y$ in the usual way; three of these values are given,
but they should be checked as an exercise.

The only difficulty is in finding the product moment $\sum\sum fxy$.

The method normally used breaks down because we do not know the
individual values of f for every pair of associated values of x and y. We
are given the total number of children for each central age at death but
not how many mothers dying at that age left 0, 1, 2, 3, ... children.

The given data are in fact the values of $\sum_y fy$ for each central age at
death, where y represents the number of children and f the frequency.

To deduce $\sum_y fxy$ is a simple matter, since x is the same for all the values
in a given column.

We assume the frequencies concentrated at the mid-point of the
intervals and take the class-interval as the unit. The origin of x (the age)
is taken at 55. There is little to be gained by altering the origin of y. The
work is as follows:

Table V

          No. of        No. of         Product moments
Age       deaths        children       (1) × (3)
x         $\sum_y f$    $\sum_y yf$    $\sum_y fxy$
(1)       (2)           (3)            (4)
                                         -          +

-7        29            36               252
-6        87            151              906
-5        99            261            1,305
-4        109           478            1,912
-3        90            450            1,350
-2        87            437              874
-1        64            370              370
 0        54            331
 1        69            430                         430
 2        73            447                         894
 3        83            547                       1,641
 4        77            590                       2,360
 5        78            547                       2,735
 6        59            398                       2,388
 7        26            212                       1,484
 8        7             50                          400
 9        4             35                          315

Total     1095          5770          -6,969    +12,647
                                              = 5,678


Clearly $\bar{y}$ (the mean number of children per death) $= \frac{5770}{1095} = 5.269$.

There is thus no need to refer to the last two columns given in the
data unless it is desired to check the given value of $\sigma_y$.

The product-moment about the chosen origins is $\frac{5678}{1095}$.
The values of the means are

$\bar{x} = \frac{1}{5}(53.292 - 55)$ class units, measured from 55,
   = -.3416 class units,

$\bar{y}$ = 5.269.

p (product moment about the means) $= \frac{5678}{1095} + (.3416)(5.269) = 6.985$.

Coefficient of correlation

$$= \frac{p}{\sigma_x\sigma_y} = \frac{6.985}{(4.091)(3.409)} = .50 \text{ approx.}$$

Note that since p and $\sigma_x$ are both expressed in class units there is no
need to make any adjustment.
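
The working of Table V can be checked by machine. The Python sketch below is an illustration and not part of the original solution: it uses the numbers of deaths and total numbers of children as read from the data, takes the standard deviation of the number of children as the given value 3.409, and recomputes the product moment and r. The variable names are chosen for the sketch.

```python
from math import sqrt

central_ages = list(range(20, 105, 5))
deaths = [29, 87, 99, 109, 90, 87, 64, 54, 69, 73, 83, 77, 78, 59, 26, 7, 4]
children = [36, 151, 261, 478, 450, 437, 370, 331, 430, 447, 547, 590, 547,
            398, 212, 50, 35]

# Age in class units of 5 years, measured from 55.
xs = [(age - 55) // 5 for age in central_ages]
n = sum(deaths)

# Column (4) of Table V: x times the total number of children at that age.
raw_product_moment = sum(x * c for x, c in zip(xs, children))

x_bar = sum(x * f for x, f in zip(xs, deaths)) / n
y_bar = sum(children) / n
sigma_x = sqrt(sum(x * x * f for x, f in zip(xs, deaths)) / n - x_bar ** 2)
sigma_y = 3.409          # given in the question

p = raw_product_moment / n - x_bar * y_bar
r = p / (sigma_x * sigma_y)

print(raw_product_moment, round(x_bar, 4), round(sigma_x, 3),
      round(p, 3), round(r, 2))
```

The sketch also serves to check the given values of the mean age at death and of the standard deviation of age, as the text recommends.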

An approximate value of r can be obtained as follows. 

For each value of x we are given not the frequencies with which each 
number of children was observed but the average number of children. 
We cannot therefore draw a scatter-diagram, but we can at once plot 
the mean value of y for each value of x and try to fit a straight line by 
inspection. As x increases from 20 to 100, i.e. 16 units of 5 years, the mean
value of y increases from 1.2 to 8.8, i.e. by 7.6.

Hence the slope of the regression line is approximately $m_1 = \frac{7.6}{16}$.

Since $m_1 = r\frac{\sigma_y}{\sigma_x}$, this gives

$$r = \frac{7.6}{16} \times \frac{4.091}{3.409} = .57.$$

The observations at ages 20 and 100 are, however, very scanty, and
it seems preferable to base an estimate on the figures for ages 30 and 90,
taking the mean value at age 90 as 8.0 instead of 8.2 to get a better run of
the figures there.

On this basis $m_1 = \frac{5.4}{12}$ and $r = \frac{5.4}{12} \times \frac{4.091}{3.409} = .54$.

The objection to both these estimates is of course that they are based on 
observations at two ages only and not on the general run of the means. 

The equation of the regression line of y on x is

$$y - \bar{y} = r\frac{\sigma_y}{\sigma_x}(x - \bar{x}),$$

or

$$y - 5.269 = .5 \times \frac{3.409}{4.091}(x + .3416).$$

Reverting to the original scale of x and the original axes, this becomes

$$Y - 5.269 = .5 \times \frac{3.409}{4.091} \times \frac{X - 53.292}{5},$$

or Y = .083X + .84, approx.

This line is drawn on the diagram on p. 76 together with the points
representing the mean values of y for each value of x.

A glance at these means shows that they cannot in fact be represented
satisfactorily by a straight line, i.e. that the regression is non-linear. The
coefficient of correlation calculated is therefore misleading.

The curve shown seems to be a reasonably good approximation to the 
run of the values and may be taken as the curve of regression. 

From general considerations we should expect that from age 20 to 45 
the number of children would increase with the age at death, since the 
size of family must depend to a large extent on the duration of life during 
the child-bearing period. Once age 45 has been passed, however, we 
should not expect the number to increase except to the very small extent 
which reflects the superior vitality of healthy women who have larger 
families than the average. 

The average number of children increases fairly regularly throughout. 
The explanation is almost certainly to be found in the fact that the given 
data relate in all probability to the deaths of married women and widows 
over a short period of time; if the average age at motherhood is taken 
as 30, this means that women dying at 90 had their children 60 years ago 
when large families were the rule rather than the exception. Similarly, 
those dying at 60 had their children on the average about 30 years ago. 

Thus the regression curve reflects the variation of number of children 
not so much with age of mother at death as with the period when the 
children were born. The chief factor operating has thus been a steadily
falling birth-rate. 

This question illustrates the difficulties involved in interpreting the 
results of an investigation into correlation. 



Diagram. Regression of number of children on length of life of mother.

EXAMPLES 3

1. A random sample of 170 cases has been taken from the new policies
issued in 1936 by a certain life office, and the distribution of the sample
with regard to age at entry and sum assured is found to be as follows:


                             Sum assured                     Total no.
Age group       £50    £100    £200    £500    £1000        of policies

15-24            18      20       6       2       -             46
25-34            21      26       6       5       1             59
35-44            10       9       3       6       1             29
45-54             7       8       5       4       -             24
55-64             8       3       1       -       -             12

Total no. of
policies         64      66      21      17       2            170


Calculate the coefficient of correlation between age and sum assured 
(a) using only the data for ages at entry up to 44, and (b) using all the 
data. Comment on your results. 

2. The following table shows the average Sum Assured under new 
assurances effected in 50 Insurance Offices in a particular year, and the 
expense ratio of these Offices for the same year. 

Calculate the coefficient of correlation between the average new Sum 
Assured and the expense ratio. 
Average sum   Expense     Average sum   Expense     Average sum   Expense
assured (£)   ratio       assured (£)   ratio       assured (£)   ratio

    784       12.61           491       10.76            …        15.94
    389       16.85           675       22.91           807       20.23
    301       23.74          1002       14.58          1158         …
    355       13.25            …        18.64           815       17.44
    687       22.26           346       16.70           757       15.46
    596       13.72           363       14.89           855       18.45
    748       16.21          1097       15.33           718       14.32
    660       23.75           646       13.81           793       21.48
    900       15.90           498       16.98           553       16.21
    629       24.28            …        13.98            …        15.42
    699       12.28           626       15.96           631       19.04
    474       21.71            …        14.17           710       14.73
    932       17.75           404       11.96           195       17.30
    941       13.65           289       13.61           867         …
    689       16.05            …        17.77           656       20.90
    797       15.80           675       13.94          1002       17.13
    535       20.29          1011       16.49




The average Sum Assured = 

    Total new Sum Assured (less reassurances given off) 
    ÷ Total number of new policies. 

Expense ratio = 

    Total expenses for the year less 5 per cent of single premiums 
    ÷ Total premium income for year (new and renewal) less single premiums. 


Do you consider the coefficient, as calculated, a good measure of 
correlation (if any) between the class of business (as measured by size 
of policy) and the cost of conducting the business? Give reasons. 

Assuming that any data you require are available, how would you 
calculate an improved coefficient? 


3. Explain briefly the terms “line of regression” and “coefficient of 
correlation”. 

An investigation, based upon the data relative to 500 lives of the 
undernoted age distribution, has been made as to the degree of correlation 
between the age, x, at the date of the investigation and the total sum, y, 
for which each life is assured. As a result it has been found that the 
equations of the lines of regression of y on x and of x on y, taking the unit 
of y as £100, are respectively 

    $5x - 57y + 142 = 0$ 
and $125x - 57y - 4652 = 0$. 

Calculate (a) the mean value of y; 

(b) the standard deviation of y; 

(c) the coefficient of correlation between x and y. 


Age    No. of lives      Age    No. of lives

31         29            41         25
32         28            42         26
33         28            43         24
34         28            44         24
35         27            45         24
36         26            46         23
37         26            47         23
38         26            48         22
39         26            49         21
40         25            50         19


4. You are given the following information in respect of the ages of 
husbands and wives: 


                                       Difference between
Age        No. of       No. of         age of husband and     Frequency
group      husbands     wives          age of wife*           of occurrence

15-            26         144               -15                     2
20-           512         709               -10                    17
25-           619         448                -5                   140
30-           212         134                 0                   699
35-            72          56                 5                   526
40-            39          32                10                   131
45-            28          21                15                    37
50-            23          13                20                    15
55-            17           9                25                     6
60-            13           5                30                     3
65-            10           4                35                     1
70-             5           2
75-             1           -

Total        1577        1577              Total                 1577


* The difference has been calculated by deducting the central age of the age- 
group of the wife from the central age of the age-group of her husband. 
Calculate the coefficient of correlation between the age of husband 
and the age of wife. 

5. An office has investigated its mortality experience under Whole 
Life Policies With Profits. The rate of mortality of the non-medical 
business is $q'_x$ and of the medical and non-medical business combined 
$q_x$. Given the exposed to risk in the non-medical class to be $E'_x$ and in 
the combined classes $E_x$, find the coefficient of correlation between $q'_x$ 
and $q_x$. Hence find the standard deviation of $q'_x - q_x$. 

6. The table below gives the values of y observed for fixed values of x 
in eight separate investigations of similar data under similar conditions. 
Calculate the coefficient of correlation between y and x and plot the 
regression line of y on x. 

Comment on your results in the light of the further information that 
y is 100,000 $q_x$, $q_x$ having been arrived at by the normal methods of 
observations of exposures and deaths. 


















CHAPTER IV 


SAMPLING 

1. We are all familiar with samples in everyday life and the 
purpose of sampling in the theory of statistics is very much what one 
would expect: to obtain information about a large body of data by 
examining a much smaller selection made in such a way as to be 
representative. The work falls into three main divisions, which 
may conveniently be dealt with separately: 

(1) The construction of the sample. 

(2) The analysis of the sample. 

(3) Induction and inference from the results of this analysis. 

The following terms will be used frequently in discussing sampling: 

2. Definitions. 

“Universe” or “Population”. The large body of data from 
which the sample is assumed to have been drawn is known as 
the universe, or in actuarial work as the population. 

“Statistic” and “Parameter”. A function such as a mean, standard 
deviation or coefficient of correlation calculated from a sample 
is known as a statistic; if it is based on the universe it is 
known as a parameter. 

“Errors” and “Deviations”. It will be found that we often 
have to consider the difference between the value of an index 
derived from a sample (a statistic) and the value derived from 
the universe (the parameter). This difference is referred to in 
statistics as the error and does not imply that any mistake 
has been made. The term deviation, which is sometimes used, 
is perhaps preferable. 

3. When the sample is all the available data. 

Quite often the selection of data has already been done by force 
of circumstance and we are given our sample “ready-made”. For 
instance, actuaries wish to know how mortality changes from time 
to time, and in this country the samples with which they work 
consist of returns made by British Life Offices, the results of censuses 
and returns of births and deaths, and so on. 

A point which usually causes difficulty to the beginner is that 
functions such as rates of mortality derived from the whole (or 
nearly the whole) of the available data should nevertheless be 
regarded as values derived from a sample and subject to sampling 
errors. 

It should be borne in mind that the purpose of a mortality 
investigation is not to obtain a record of mortality actually 
experienced but to produce rates of mortality which would probably 
have been revealed if the data had been unlimited in extent. 

For instance, the years 1924-29 had their own peculiarities, such 
as epidemics, unusually severe weather, freedom from any major 
war, and so on. The object of an investigation into the data for these 
years for producing the A 1924-29 Table was not to eliminate all 
these features but to form an estimate of the results in an “average” 
year. This “average” year, like the “average” man, does not exist. 
Although the whole of the available data were apparently used for 
the A 1924-29 Table and in similar investigations, sampling errors 
were bound to arise because of limitations of numbers and scope. 

Any population will be subject, from year to year, to random 
fluctuations, which will appear as sampling errors in any investigation 
involving the population. 

Similarly, in the construction of the English Life Tables the data 
should be regarded as a sample even if every life in England and 
Wales had been correctly observed and the facts relating to that 
life correctly incorporated. The force of this will be more fully 
appreciated when we come to consider graduation. 

4. Random sampling. 

When the sample has to be constructed the ideal would be to 
form a microcosm similar in all respects to the universe but, of 
course, on a very much smaller scale. As we have only limited 
information about the universe (otherwise we should not need a 
sample) this is impossible, but a good approximation can be 
obtained by selecting the constituent elements of the sample at 
random from the universe. We may say that a sample of n individuals 
is taken at random from a universe if all possible samples 
of n had an equal chance of being selected. 
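
The definition can be illustrated by a minimal sketch in Python. The universe of six numbered members, the seed and the helper names are assumptions made purely for illustration; the standard library routine `random.sample` is used as a stand-in for the mechanical drawing of tickets.

```python
import itertools
import random

def all_samples(universe, n):
    """Every possible sample of n distinct members (order ignored)."""
    return list(itertools.combinations(universe, n))

def draw_random_sample(universe, n, rng):
    """Pick one sample so that each possible sample is equally likely."""
    return tuple(sorted(rng.sample(universe, n)))

universe = list(range(1, 7))   # hypothetical universe of six numbered members
rng = random.Random(1945)      # fixed seed so the sketch is repeatable
sample = draw_random_sample(universe, 3, rng)
possible = all_samples(universe, 3)
```

Here `possible` lists the twenty samples of three that could have arisen, and `sample` is one of them, chosen without preference for any.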

This sounds a simple matter, but in practice it is exceedingly 
difficult to avoid bias owing to personal idiosyncrasies or other 
factors even more difficult to identify or control. If it is possible 
to assign a number or other identifying symbol to each member of 
the universe a good way of forming a random sample is to draw 
tickets with the numbers on them from a drum in which they have 
been thoroughly mixed. This can rarely be done and bias nearly 
always creeps into the work, though its presence may not be 
detected. To appreciate the practical difficulties it is essential to 
read accounts of actual investigations, and the student is strongly 
recommended to study two very interesting papers on “Enquiry 
by Sample” contributed by Mr J. Hilton to the Royal Statistical 
Society and reproduced in “Reprints 1938”. 

5. Systematic methods of sampling. 

The student should appreciate that a random sample may be 
constructed by a systematic process. For instance, a political 
agent who wished to form an estimate of the strength of the parties 
in a given district might instruct canvassers to call at every tenth 
house and provided there were no deviation from this strict rule 
the sample would probably be random since there is no reason to 
expect political views to be associated with number of the house. 

So long as the sample is random with respect to the character or 
characters to be measured it is likely that this method will lead to 
a truly representative selection of individuals. 

6. Stratified sampling. 

Here the body of data is split into groups or “strata” by purposive 
means and one or more representatives from each group 
are selected at random. 

Suppose, for example, that a manufacturer of electric-light bulbs 
wishes to know what current passes through them at a given standard 
voltage, what candle-power they develop, and how many hours 
“life” they have. To do this he might select a hundred bulbs by 
walking into the stock room and picking out a bulb here and there, 
thus hoping to obtain a random sample. Almost certainly bias would 
be much greater than he realized: for instance, he would probably 
tend to pick from the centre rather than the edges, from the top layer 
rather than any other and to pick clean bulbs rather than dirty ones. 
None of these points may be of importance, but the sample could 
not be described as random and results deduced from it might be 
misleading. For a purposive sample he might take every hundredth 
bulb turned out by each machine. On the face of it the 
sample thus produced would be representative in that personal bias 
seems to have been eliminated. The danger of this method is, 
however, that the “period” of the sampling may coincide with 
some “period” in the output. For instance, if a machine has five 
moulds for making the glass bulb itself, a sample formed by 
selecting every hundredth bulb would mean that only one mould 
was really being tested. Again if the count started each day when the 
first shift went on duty the first ninety-nine of each day’s output 
would never be represented in the sample. If a guarantee of 
performance is given by the manufacturers it is important that all 
the bulbs, including those made at the beginning of the day, should 
be properly represented in the sample. 
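
The coincidence of periods can be seen in a short Python sketch. It assumes, purely for illustration, that the five moulds are used in strict rotation, so that bulb k comes from mould (k - 1) mod 5; the function name and the output of 10,000 bulbs are likewise hypothetical.

```python
def moulds_sampled(total_bulbs, n_moulds, step):
    """Return the set of moulds represented when every `step`-th bulb is taken.

    Bulb k (counting from 1) is assumed to come from mould (k - 1) % n_moulds.
    """
    picked = range(step, total_bulbs + 1, step)
    return {(k - 1) % n_moulds for k in picked}

every_hundredth = moulds_sampled(10_000, 5, 100)     # period 100, a multiple of 5
every_ninety_ninth = moulds_sampled(10_000, 5, 99)   # period 99, not a multiple of 5
```

On these assumptions `every_hundredth` contains a single mould, whereas `every_ninety_ninth` contains all five, which is why the sampling interval should not share a factor with any cycle in the output.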

Alternatively the output of each machine might be taken as 
a stratum and specimens selected from each by some truly random 
method, as far as possible independent of any human operator. 
This would be “stratified sampling”. 

As an example of the pitfalls which would trap the unwary let us 
suppose that the works manager decided to take a bulb from each 
machine every time his telephone bell rang. As there is no apparent 
connection between his telephone calls and the quality of the output 
this might at first appear a satisfactory way of constructing a sample. 

There are, however, many serious objections, such as the fact 
that telephone calls are not usually distributed over the working day 
in a random manner, while the selection of bulbs from all machines 
at a given time would tend to introduce correlation which would be 
difficult to assess or control. 

A better method would be to decide the times at which drawings 
were to be made for each machine independently by drawing 
numbered tickets from a drum. In many factories, however, each 
article is numbered and a satisfactory and convenient way of 
constructing a sample would be as follows. Suppose the output of 
one machine for a given day had numbers H 43295 to H 43726 while 
the output of another had numbers K 83962 to K 84403. For the 
first machine discs with one of the numbers 0-9 on each could be 
drawn from a bag and replaced after the number had been recorded, 
thus producing a random series such as 

    4 3 9 6 2 5 4 3 8 7 ... 

If these were marked off in threes we should obtain 

    439, 625, 438, ... 

as the last three figures of the numbers of the bulbs to be drawn; 
i.e. we should select 

H43439, H43625, H43438, ..., 

any number less than H 43295 or greater than H 43726 being ignored. 
Similarly, each of the other machines could be dealt with and the 
sample obtained in this way should be free from serious bias. 
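
The procedure just described may be sketched in Python as follows. The function names are hypothetical, the serial letters H and K are dropped, and two points not stated in the text are assumed: that the day's output spans fewer than 1000 numbers (so that three figures identify at most one bulb) and that a bulb drawn a second time is ignored.

```python
import random

def draw_digits(count, rng):
    """Draw discs numbered 0-9 from a bag, replacing each after recording."""
    return [rng.randrange(10) for _ in range(count)]

def serials_from_digits(digits, first, last):
    """Mark the digits off in threes and map each group to a serial number.

    Each group gives the last three figures of a serial; the serial is kept
    only if it lies between `first` and `last` inclusive.
    """
    chosen = []
    usable = len(digits) - len(digits) % 3      # ignore an incomplete final group
    for i in range(0, usable, 3):
        ending = digits[i] * 100 + digits[i + 1] * 10 + digits[i + 2]
        serial = first - first % 1000 + ending
        if serial < first:
            serial += 1000                      # range may run past a thousand, e.g. 83962 to 84403
        if serial <= last and serial not in chosen:   # assumed: repeats are ignored
            chosen.append(serial)
    return chosen

# The series quoted in the text, applied to the first machine.
selected = serials_from_digits([4, 3, 9, 6, 2, 5, 4, 3, 8, 7], 43295, 43726)
# selected == [43439, 43625, 43438]

# A fresh series for the second machine, using a seeded generator.
second = serials_from_digits(draw_digits(60, random.Random(7)), 83962, 84403)
```

The same function serves every machine, since only the first and last numbers of the day's output need be supplied.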

7. Simple sampling. 

There is a particular form of random sampling, known as simple 
sampling, which is of special importance. The reader will be familiar 
with problems in probability involving the drawing of balls from 
an urn or cards from a pack when the object drawn is always replaced 
before the next drawing. The essential feature of such a 
problem is that the probability of drawing any one object is the 
same as that of drawing any other and this probability is the same 
when the last draw is made as it was at the beginning. 

In simple sampling it is assumed that every individual in the 
universe is equally likely to be chosen in the sample and that the 
universe is so large that when the sample is extracted the remainder 
is to all intents and purposes the same as the original universe. Thus 
the chance of drawing a particular individual for inclusion in the 
sample at the nth draw is the same as it was when the sampling 
commenced. For simple sampling it is also assumed that the chance 
of drawing any individual is independent of that of drawing any 
other. 
This should be borne in mind in considering mortality statistics, 
which are usually far from homogeneous and which are affected 
in such a way by wars, epidemics, etc. that it is not true to say that 
the chance of death of any one individual is independent of the 
chance of death of any other. 

8. Analysis of the sample. 

This is usually a straightforward matter and the methods pre¬ 
viously described are used to calculate convenient indices to express 
the main characteristics. The algebraic functions, such as the 
mean, standard deviation and coefficient of correlation, are most 
commonly used, as they are good estimates of the parameters 
involved in normal theory. 

9. Inference and deduction from the sample. 

There are three main types of problem: 

(a) when both the universe and the given sample are available 
for analysis; 

(b) when only a sample is given; 

(c) when two or more samples are given but the universe is 
unknown. 

(a) is probably the least important, (b) is most commonly met 
with in actuarial work, and (c) is common in enquiries into social 
questions, education, housing, hygiene, and so on. 

Under (a) the question to be considered is whether the given 
sample is likely to have been drawn from the given universe by 
random or other unbiased sampling. We might, for instance, be 
given records of the heights of a hundred adult males and also the 
mean height and standard deviation of the height of all adult 
males in England and Wales. 

If we found the mean and standard deviation for the sample it 
is extremely unlikely that they would coincide exactly with the 
figures for all England and Wales (our “universe”). For a sample 
of ten we should be prepared for large discrepancies, for a sample 
of five hundred we should, as we say, expect “average” results. 
The point to be decided is what sort of discrepancies are likely to 
arise in a sample of a given size if it were selected in some unbiased 
way from the universe. If we found that the sample values differed 
from those for the universe by improbable amounts we should 
suspect that the sample was not taken from that universe (e.g. they 
might be a hundred Norwegians) or else that bias had crept in 
(e.g. by selecting the sample from a particular district or a particular 
trade). 

(b) When only a sample is available we use it to estimate conditions 
in the universe by testing a hypothesis or set of hypotheses 
about the universe. A very common example is testing an estimate 
of a parameter (universe value) by comparing it with a statistic 
(sample value). We might, for instance, wish to test a theory that 
the average height of males over 21 years of age in England and 
Wales was 5 ft. 8.3 in. by measuring a representative sample of, say, 
two hundred and calculating the sample mean. 

In mortality statistics the hypothesis specifies that the rates of 
mortality in the population are those given by the graduated table, 
and that any discrepancy between the rate derived from the data 
and the graduated rate is due merely to sampling errors. We can 
find approximately the probability that the observed discrepancy or 
one still greater would arise in this way, a large discrepancy 
corresponding to a small probability. If the probability is very small 
we suspect that the hypothesis about the population rates is 
fallacious; we suspect that the rates of mortality in the graduated 
table do not represent the facts as far as we can judge from the 
observations made. It may seem that (a) and (b) are almost 
indistinguishable, but there is a fundamental difference. In (a) the 
parameter is known and the sample is being tested. In (b) the 
parameter is not known but an estimate of it (often based on the 
sample) is being tested by means of the sample. 

(c) When we are given two or more samples we try to ascertain 
whether the universes from which they were taken are likely 
to be the same (or similar) or whether they are different. For 
instance, the drug M and B 693 has been found to give good 
results in the treatment of pneumonia. As it is impossible to 
collect statistics of all cases, whether treated by the drug or not, a 
random sample might be taken of, say, two hundred cases so treated 
and another sample of two hundred as far as possible similar to 
the first in every respect (e.g. age distribution) and subjected to 
identical treatment except for the administration of the drug. As 
a result of the analysis of the samples we might be able to say that 
the differences between the two samples could not be explained 
as due to paucity of data and sampling errors, but were probably 
due to a real difference in the universes from which they were taken. 

In (a), (b) and (c) we try to arrive at a probability that sampling 
would give rise to an error equal to or greater than the error observed. 
This usually causes difficulty to the student, who asks, “Why not 
try to estimate the probability of obtaining the observed error owing 
to paucity of data? Why include larger errors?” The reason is that 
the probability of obtaining a given error is meaningless if the 
variable is continuous (as it usually is). 

It will be remembered that in applying Calculus to questions in 
Probability we had to deal with probabilities relating to small 
intervals. Thus on p. 308 of Mathematics for Actuarial Students, 
Part II, a typical example involves the probability that a man 
arrived in London between t and t + dt from the beginning of the 
year. The probability that he arrived at exact time t is zero. 

Similarly, in statistics there cannot be a probability of obtaining 
a given sampling error if the variable is continuous; what we 
evaluate is the probability of obtaining an error greater than that 
actually observed. As the student will no doubt expect, this 
function is represented by an integral. Before however we can 
deal with this integral we must consider sampling distributions. 
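
As a foretaste, the integral can be evaluated numerically once a form is assumed for the sampling distribution. The Python sketch below assumes a normal sampling distribution and counts errors in either direction (a one-sided version would halve the result); the figures for the observed error and the standard error are invented for illustration and the function name is hypothetical.

```python
import math

def tail_probability(observed_error, standard_error):
    """Chance of an error at least as large (either sign) as that observed,
    assuming the sampling distribution is normal with the given standard error."""
    z = abs(observed_error) / standard_error
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical figures: sample mean 0.4 in. above the hypothetical 5 ft. 8.3 in.,
# with an assumed standard error of 0.2 in.
p = tail_probability(0.4, 0.2)
```

An observed error of twice the standard error gives a probability of roughly one in twenty-two, and the probability falls rapidly as the error grows.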

10. Sampling distributions. 

Suppose that from a universe of N we take every possible sample 
of n. There will be $^{N}C_{n}$ such samples because, although a particular 
value or event may be repeated several times in N, each of these 
values or events may be considered as a separate “individual” for 
the purposes of sampling. For each sample we can calculate some 
index such as $\sigma$, the standard deviation of the sample, and group the 
results into a frequency distribution, which we can if we wish 
represent graphically as a histogram. If the number of samples is 
very great the class-interval used in drawing the histogram can be 
reduced. By taking a very large number of samples, each of n 
individuals, and calculating the standard deviation of each, we shall 
ultimately arrive at a smooth curve showing how the values of $\sigma$ 
are distributed. 

Similarly, we could find how the means of the samples were 
distributed and could arrive at, or at least approximate to, a smooth 
curve representing this distribution of the sample means—the 
“sampling distribution of the means”. In the previous paragraph 
we discussed the sampling distribution of the standard deviation; 
if two variables were present we could imagine a sampling distribution 
of the coefficient of correlation, one value being calculated for 
each sample. 

From the above it seems necessary not only to calculate a given 
statistic (say the mean) for the sample of n observations but to 
construct all the other possible samples of n individuals and calculate 
the statistic for each in order to form some idea of the sampling 
distribution. 
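
For a very small universe the complete enumeration can actually be carried out, and a Python sketch may help to fix ideas. The universe of five values and the function name are assumptions for illustration only; exact fractions are used so that no rounding obscures the result.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution_of_mean(universe, n):
    """Frequency of each sample mean over every possible sample of n.

    `combinations` treats each member as a separate individual, even when
    two members happen to have the same value.
    """
    means = (Fraction(sum(s), n) for s in combinations(universe, n))
    return Counter(means)

universe = [2, 3, 5, 7, 11]          # hypothetical universe, N = 5
dist = sampling_distribution_of_mean(universe, 2)
n_samples = sum(dist.values())
mean_of_means = sum(m * f for m, f in dist.items()) / n_samples
```

With N = 5 and n = 2 there are ten samples; for universes of practical size the number of samples is far too great for this direct approach, which is why theoretical results are needed.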

11. Normal sampling distributions. 

A considerable literature exists about sampling distributions. 
In most instances, however, it is assumed that the populations 
follow the normal curve. Actual experiments based on extensive data 
indicate that this assumption is a good approximation in most cases 
although it is unlikely that many distributions are really accurately 
represented by any such simple law. 

For practical purposes we nearly always assume that sampling 
distributions are normal. (An important exception will be mentioned 
later when we come to the $\chi^2$ test.) This assumption is 
usually approximately correct if n, the number of values involved 
in the statistic, is large but it should be made only in large sample 
theory. 

12. Biased and unbiased estimates. 

The expected value of a statistic can be expressed in terms of the 
population parameters. On intuitive grounds we use the observed 
values of statistics as estimates of population parameters. If the 
population parameter which it is desired to estimate is, in fact, 
equal to the expected value of the statistic employed the latter is 
said to be an unbiased estimate of the parameter. For example, the 
mean of a sample of n values is an unbiased estimate of the 
population mean. 

On the other hand, the sample standard deviation, s, is not an 
unbiased estimate of the population standard deviation, $\sigma$. Such 
an estimate is called a biased estimate. 

To appreciate why s is a biased estimate of $\sigma$ we note that, in 
calculating s, the deviation of each observed value is measured from 
the sample mean while, in calculating $\sigma$, the corresponding deviations 
are measured from the population mean. The mean square deviation 
of n values is a minimum when measured from the mean of those 
values, so that s will, on the average, be less than the parameter $\sigma$ 
and will be a biased estimate. For large samples, however, this bias 
is negligible and we shall assume in this book that all estimates used 
are unbiased. 
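
The size of the bias can be checked by exact enumeration in a small case. The Python sketch below assumes simple sampling (drawing with replacement) from a hypothetical universe of six values, and works with the squares of s and $\sigma$ so that the arithmetic stays exact; the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def mean_square_deviation(values):
    """Mean of squared deviations from the mean of `values` (divisor n)."""
    n = len(values)
    m = Fraction(sum(values), n)
    return sum((v - m) ** 2 for v in values) / n

universe = [1, 2, 3, 4, 5, 6]        # hypothetical universe
n = 3
sigma_sq = mean_square_deviation(universe)
samples = list(product(universe, repeat=n))      # every ordered sample, with replacement
expected_s_sq = sum(mean_square_deviation(s) for s in samples) / len(samples)
ratio = expected_s_sq / sigma_sq
```

On these assumptions the average of the sample values of $s^2$ is (n - 1)/n times $\sigma^2$, here two-thirds; the factor approaches unity as n increases, which is the sense in which the bias is negligible for large samples.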

13. Standard error. 

Although, as we have seen in the previous paragraph, the expected 
value of a statistic in large sample theory may be assumed equal to 
the parameter, a little consideration will show that the standard 
deviation of the sampling distribution, which is defined as the 
standard error of the statistic, will not approximate to the parameter 
$\sigma$, the standard deviation in the population. Thus the statistic “the 
mean of a sample of n individuals” will tend to be a much more 
stable quantity than the individuals themselves and hence the 
standard error of the mean should be much less than $\sigma$. As we shall 
see, the standard error of the mean is $\sigma/\sqrt{n}$. 
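
The result may be checked exactly in a small case before it is proved. The Python sketch below assumes simple sampling (with replacement) from a hypothetical universe of six values and compares the variance of all the sample means with $\sigma^2/n$; the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Mean square deviation of `values` about their own mean."""
    k = len(values)
    m = Fraction(sum(values), k)
    return sum((v - m) ** 2 for v in values) / k

universe = [1, 2, 3, 4, 5, 6]        # hypothetical universe
n = 4
sample_means = [Fraction(sum(s), n) for s in product(universe, repeat=n)]
var_of_means = variance(sample_means)
```

Here `var_of_means` equals `variance(universe) / n`, so that the standard error, being its square root, is $\sigma/\sqrt{n}$: quadrupling the size of the sample halves the standard error.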

It is important to notice that, apart from bias, the accuracy to be 
expected of an estimate is important. For an unbiased estimate this 
will be appropriately measured by the standard error of the 
statistic employed. 

Expressions have been obtained for the standard errors of most 
of the well known statistics but the theoretical work involved is 
usually difficult and we shall content ourselves with dealing with 
two of the simplest: a class frequency and the mean. The results 
for other statistics are quoted in section 16 of this chapter. 
14. Standard error of a class frequency. 

Sometimes the universe is divided into different classes and we 
may be interested in the numbers in the various classes or class 
frequencies in a given sample. For instance, women may be 
divided into single, married and widowed (including divorced). 
Obviously the class frequencies shown in a sample will not be 
comparable with the class frequencies in the universe, and for 
this reason we usually deal with “proportionate class frequencies” 
found by dividing the class frequencies by the total number in 
the sample or the universe as the case may be. Thus, if out of a 
sample of 250 women 104 were married we should say that the 
class frequency for married women was 104 and the proportionate 
class frequency .416. 

Let us assume that for a particular class the proportionate 
frequency in the universe is q and that we draw a random sample of n. 

Assuming that the laws of simple probability apply, the chance 
that all n are in the given class is $q^{n}$. Similarly, the chance that n - 1 
are in the given class and one is not is $^{n}C_{1}\,q^{n-1}p$, and, generally, the 
chance that r are in the given class and the remaining n - r are not 
is $^{n}C_{r}\,q^{r}p^{n-r}$, where p = 1 - q. 

Hence, if we took a very large number S of random samples, 
each of n observations, we should expect the class frequencies to be 
distributed as follows: 

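                      No. of samples in which 
Class frequency       the frequency is observed 

    n                 $S\,q^{n}$ 
    n - 1             $S\,{}^{n}C_{1}\,q^{n-1}p$ 
    n - 2             $S\,{}^{n}C_{2}\,q^{n-2}p^{2}$ 
    ...               ... 
    r                 $S\,{}^{n}C_{r}\,q^{r}p^{n-r}$ 
    ...               ... 
    2                 $S\,{}^{n}C_{n-2}\,q^{2}p^{n-2}$ 
    1                 $S\,{}^{n}C_{n-1}\,q\,p^{n-1}$ 
    0                 $S\,p^{n}$ 

    Total             $S(p+q)^{n} = S$ 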

Our sampling distribution is in fact the successive terms in the 
expansion of $S(q+p)^{n}$. We know therefore that the mean is nq and 
the standard error $\sqrt{npq}$ (see Chapter II, pp. 32, 33). 

If instead we had considered the proportionate frequency we 
should have had a mean value q and standard error $\sqrt{pq/n}$, or 1/nth of 
the values for the class frequency itself. 

We therefore have the important results for a sample of n. 

                                  Mean value     Standard error 

Class frequency                   $nq$           $\sqrt{npq}$      ......(1) 

Proportionate class frequency     $q$            $\sqrt{pq/n}$     ......(2) 


Note. Although the mean value of the class frequency is not of course 
the class frequency in the universe, the general rule does apply for the 
proportionate class frequency, which is independent of the number 
concerned. 
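
Results (1) and (2) can be verified numerically by summing the terms of the binomial expansion directly. In the Python sketch below the sample size of 250 follows the example of the married women, but the universe proportion q = 0.4 is an assumed figure chosen for illustration, and the function name is hypothetical.

```python
from math import comb, sqrt

def class_frequency_distribution(n, q):
    """Terms of (q + p)^n: chance that r of the n sampled fall in the class."""
    p = 1.0 - q
    return [comb(n, r) * q**r * p**(n - r) for r in range(n + 1)]

n, q = 250, 0.4      # q is an assumed universe proportion, for illustration
terms = class_frequency_distribution(n, q)
mean = sum(r * t for r, t in enumerate(terms))
sd = sqrt(sum((r - mean) ** 2 * t for r, t in enumerate(terms)))
proportionate_se = sd / n
```

Apart from rounding in the floating-point arithmetic, `mean` agrees with nq, `sd` with $\sqrt{npq}$ and `proportionate_se` with $\sqrt{pq/n}$.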


15. Normal approximation to sampling distribution of a proportionate 
class frequency. 

Having obtained the mean and standard error of a proportionate 
class frequency as above, we usually assume that the sampling 
distribution roughly follows the normal curve. This seems at first 
inexplicable when we know that the actual distribution is the 
successive terms of $(q+p)^{n}$. The chief reasons for the assumption 
are as follows. By a suitable choice of origin and scale any normal 
distribution can be reduced to the standard form 
$y = \frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}$, for 
which exhaustive sets of tables are available. The curve is in fact 
adequately “mapped”. The distribution $(q+p)^{n}$ cannot, however, 
be reduced to a simple standard form and, except in special 
instances, the numerical work of expansion is prohibitive. The 
second reason is the fact that the normal curve is quite a good 
approximation to the binomial distribution provided that either q 
is roughly equal to p or n is very large. (The normal curve is of 
course $y = y_{0}e^{-x^{2}n/2pq}$, where the origin of x is taken at q, the mean.) 

The following tables, given by H. L. Seal in his interesting 
paper “Tests of a Mortality Table Graduation” (J.I.A. Vol. LXXI), 
illustrate this point. The notation has been altered and the way in 
which the ordinates have been selected for comparison will be 
apparent only after the original paper has been read. 
Comparison of terms of $(p+q)^{n}$ with ordinates of the Normal Curve 

  q = .0025, n = 4000      q = .005, n = 2000       q = .01, n = 1000 
  Binomial    Normal       Binomial    Normal       Binomial    Normal 

              .8742        .8746       .8741        .8743       .8741 
              .6348        .6352       .6344        .6343       .6336 
              .4286        .4275       .4280        .4264       .4269 
  .2651       .2678        .2645       .2672        .2633       .2660 
  .1500       .1542        .1495       .1537        .1486       .1527 
  .0776       .0816        .0773       .0812        .0766       .0805 
  .0371       .0396        .0369       .0393        .0365       .0388 
  .0169       .0176        .0168       .0174        .0165       .0171 
  .0076       .0071        .0075       .0070        .0074       .0069 
  .0035       .0026        .0034       .0026        .0033       .0025 
  .0016       .0009        .0015       .0009        .0015       .0008 
  .0007       .0003        .0007       .0003        .0007       .0003 
  .0003       .0001        .0003       .0001        .0003       .0001 
  .0001       .0000        .0001       .0000        .0001       .0000 
  .0000                    .0000                    .0000 



Comparison of terms of $(p+q)^{n}$ with ordinates of the Normal Curve 

  q = .03, n = 333         q = .05, n = 200         q = .1, n = 100 
  Binomial    Normal       Binomial    Normal       Binomial    Normal 

  .8730       .8724        .8716       .8711        .8681       .8676 
  .6308       .6299        .6272       .6265        .6178       .6171 
  .4216       .4219        .4168       .4173        .4042       .4047 
  .2584       .2609        .2536       .2562        .2410       .2434 
  .1444       .1483        .1405       .1443        .1301       .1336 
  .0735       .0773        .0708       .0744        .0636       .0668 
  .0345       .0368        .0328       .0350        .0284       .0303 
  .0154       .0160        .0144       .0150        .0119       .0124 
  .0067       .0063        .0062       .0058        .0049       .0046 
  .0030       .0023        .0027       .0021        .0020       .0015 
  .0013       .0007        .0012       .0007        .0008       .0005 
  .0006       .0002        .0005       .0002        .0003       .0001 
  .0002       .0001        .0002       .0001        .0001       .0000 
  .0001       .0000        .0000       .0001        .0000       .0000 
  .0000 
It will be seen therefore that within fairly wide limits values 
derived from the normal curve are good approximations to the 
true binomial values. This is particularly so in the neighbourhood 
of the mean (the origin). 
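
The closeness of the approximation near the mean can be examined directly. The Python sketch below compares individual binomial terms with ordinates of the normal curve having the same mean and standard error, for q = .1 and n = 100. It does not attempt to reproduce Seal's tables, since the way in which his ordinates were selected is not given here; the function names are hypothetical.

```python
from math import comb, exp, pi, sqrt

def binomial_term(n, q, r):
    """Chance that exactly r of the n sampled fall in the class."""
    return comb(n, r) * q**r * (1.0 - q) ** (n - r)

def normal_ordinate(n, q, r):
    """Normal density with mean nq and variance npq, evaluated at r."""
    var = n * q * (1.0 - q)
    return exp(-((r - n * q) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)

n, q = 100, 0.1
comparison = [(r, binomial_term(n, q, r), normal_ordinate(n, q, r))
              for r in range(0, 21)]
largest_gap = max(abs(b - nrm) for _, b, nrm in comparison)
```

Each entry of `comparison` gives the class frequency r, the binomial term and the corresponding normal ordinate, and `largest_gap` measures the worst disagreement over the range examined.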

16. Standard error of the mean. 

The second index with which we shall deal is the mean of the 
sample, and we shall first prove the almost self-evident fact that 
the mean of all the sample means is the mean of the universe. 
Suppose that the universe consists of N variates denoted by 
$u_1, u_2, \ldots, u_N$, the values being measured from some convenient 
origin. 

The number of different samples of n each which can be con¬ 
structed is which for convenience we shall denote by m. 
This applies even if several of the w’s are equal, since each observa¬ 
tion has a separate individuality although its values may coincide 
with that of some other observation. 

Denote the values in the rth sample by 

^r:2> ^r:3> ••• ••• ^r:n* 

Then the sample mean 

The mean of all the m sample means is therefore 

TTITI 

Clearly each u appears in of these brackets (once in each), 

thus making the total number of terms N = n 

as it should, there being m samples of n each. 

Hence the expression for the mean of the sample means reduces to 

(% + ^2 +... + u^). 

tnti 

Since m=^C„, this becomes 

^(«i+Ma+«8+ •••+%) 
or the mean of the universe. 
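
The result can be checked by direct enumeration. The sketch below takes
a small hypothetical universe of six values (two of them equal, to
illustrate the remark about separate individuality), forms every sample
of three, and compares the mean of the sample means with the mean of
the universe. The data and variable names are assumptions made for
illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

universe = [3, 7, 7, 10, 15, 18]   # hypothetical universe, N = 6
n = 3                              # size of each sample

# combinations() treats equal values in different positions as distinct,
# so the two 7's each keep their separate individuality.
samples = list(combinations(universe, n))
m = len(samples)

sample_means = [Fraction(sum(s), n) for s in samples]
mean_of_means = sum(sample_means) / m
universe_mean = Fraction(sum(universe), len(universe))

print(m, mean_of_means, universe_mean)
```

Exact fractions are used so that the equality is not obscured by
rounding.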
To find the standard error of the sample means, take the universe
mean as origin and denote the observed values measured from it by
$v_1, v_2, \ldots, v_N$, so that $\sum_{1}^{N} v = 0$.

The mean of the rth sample measured from the new origin is
given by

$$M_r = \frac{1}{n}(v_{r:1} + v_{r:2} + \ldots + v_{r:n}),$$

using an obvious extension of the previous notation.

The standard deviation of all the sample means, $\sigma_M$, is given
by the relation

$$m\sigma_M^2 = \sum_{r=1}^{m} M_r^2$$

(since by the previous paragraph the mean of the M’s is our new
origin); i.e.

$$mn^2\sigma_M^2 = \{(v_{1:1} + v_{1:2} + \ldots + v_{1:n})^2 + (v_{2:1} + v_{2:2} + \ldots + v_{2:n})^2 + \ldots + (v_{m:1} + v_{m:2} + \ldots + v_{m:n})^2\}. \qquad (3)$$

If we expand the right-hand side and collect like terms it will
reduce to the form

$$A(v_1^2 + v_2^2 + \ldots + v_N^2) + 2B(v_1v_2 + v_1v_3 + \ldots + v_rv_s + \ldots),$$

since although for convenience we have used two suffixes to denote
the sample considered and the observation in the sample, there are
in fact only N different v’s and each of them will occur in many
samples.

Any given v will occur in ${}^{N-1}C_{n-1}$ samples and therefore in
this number of brackets on the right-hand side of (3). Hence on
collecting like terms the coefficient A of each $v^2$ will be
${}^{N-1}C_{n-1}$. Similarly, any two given v’s, $v_r$ and $v_s$ say,
will occur in ${}^{N-2}C_{n-2}$ samples, and hence the coefficient 2B
of each term such as $v_rv_s$ will be $2 \cdot {}^{N-2}C_{n-2}$.

We have therefore

$$mn^2\sigma_M^2 = {}^{N-1}C_{n-1}(v_1^2 + v_2^2 + \ldots + v_N^2) + 2 \cdot {}^{N-2}C_{n-2}(v_1v_2 + \ldots). \qquad (4)$$

Since the v’s are measured from the mean:

$$v_1 + v_2 + \ldots + v_N = 0.$$

Squaring,

$$2(v_1v_2 + v_1v_3 + \ldots + v_rv_s + \ldots) = -(v_1^2 + v_2^2 + \ldots + v_N^2).$$

Substituting in (4):

$$mn^2\sigma_M^2 = ({}^{N-1}C_{n-1} - {}^{N-2}C_{n-2})(v_1^2 + v_2^2 + \ldots + v_N^2).$$

But $v_1^2 + v_2^2 + \ldots + v_N^2 = N\sigma^2$, where $\sigma$ is the
standard deviation of the universe, and $m = {}^{N}C_{n}$, so that we
finally have

$$\sigma_M^2 = \frac{{}^{N-1}C_{n-1} - {}^{N-2}C_{n-2}}{{}^{N}C_{n}} \cdot \frac{N\sigma^2}{n^2} = \frac{N-n}{N-1} \cdot \frac{\sigma^2}{n}. \qquad (5)$$

If N is large we can assume that N - n is not very different from
N - 1 and write

$$\sigma_M = \frac{\sigma}{\sqrt{n}}. \qquad (6)$$

This gives us the standard error, i.e. the standard deviation of
the sampling distribution of the mean, in terms of $\sigma$, the
standard deviation of the universe.
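
Formula (5) can be verified in the same way as the result for the mean.
The sketch below, again using a small hypothetical universe and
illustrative variable names, enumerates every sample, finds the variance
of the sample means directly, and compares it with formula (5) and with
the large-universe approximation (6).

```python
from fractions import Fraction
from itertools import combinations

universe = [3, 7, 7, 10, 15, 18]   # hypothetical universe
N, n = len(universe), 3

mean = Fraction(sum(universe), N)
# sigma^2: variance of the universe about its own mean
var_universe = sum((u - mean) ** 2 for u in universe) / N

# Variance of the means of all samples of n, found by enumeration
means = [Fraction(sum(s), n) for s in combinations(universe, n)]
var_of_means = sum((x - mean) ** 2 for x in means) / len(means)

formula_5 = Fraction(N - n, N - 1) * var_universe / n   # exact
formula_6 = var_universe / n                            # N large

print(var_of_means, formula_5, formula_6)
```

With so small a universe the factor (N - n)/(N - 1) is far from unity,
so that (6) overstates the variance appreciably; the two agree closely
only when N is large compared with n.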

Similarly, expressions have been obtained for the standard errors 
of most of the well-known statistics, but the proofs are generally 
difficult and beyond the scope of this book. They are set out for 
convenience in the following table and should be memorized. 


Standard errors of well-known statistics based on samples of n

Index                                Standard error           Remarks

Class frequency (nq)                 $\sqrt{npq}$             q is the proportionate class
Proportionate class frequency (q)    $\sqrt{pq/n}$            frequency in the universe
                                                              and p = 1 - q
Mean (M)                             $\sigma/\sqrt{n}$        $\sigma$ is the standard deviation
Standard deviation ($\sigma$)        $\sigma/\sqrt{2n}$ *     of the universe
Coefficient of correlation (r)       $(1-r^2)/\sqrt{n}$ *     r is the coefficient of
                                                              correlation in the universe

* These formulae are approximate and should generally be used only if
n is large.

The mean class frequency is nq and the means of the other
indices are the corresponding values in the universe.

The formula $\sigma/\sqrt{2n}$ for the standard error of a standard
deviation applies exactly only if the standard deviation is itself
derived from a normal distribution. It does not therefore strictly
apply to the standard deviation of a binomial distribution, although
in practice it is so applied. The general formula for the standard
error of a standard deviation is

$$\sqrt{\frac{\mu_4 - \mu_2^2}{4\mu_2 n}},$$

where $\mu_2$ and $\mu_4$ are the second and fourth moments of the
universe about its mean, which however is largely of theoretical
interest.

We are now in a position to deal with the types of problem 
mentioned on p. 86. 



17. When information about the universe is available. 

As has been stated, the problem is to find whether the sample is 
likely to have been obtained from the universe by random or other 
unbiased sampling. To do this we examine those statistics in which 
we are particularly interested; usually these include the mean and 
the standard deviation. 

Suppose, for instance, that a headmaster wished to know how the 
standard of English teaching in his school compared with that of 
the general body of schools and decided to compare the results of 
(say) the School Certificate examination, treating his own school as 
a sample. 

He might decide to take four “classes”: under 25% marks,
25-49% marks, 50-74% marks and 75% marks and over. Suppose
that he obtained the following results for his school, which sent in
a hundred candidates for the English papers, for which the maximum
possible marks were 250:

                                          School        Results for
                                                     the whole country

Mean mark                                   165             170
Standard deviation of marks                  22              20
Proportion with less than 25% marks      18 per cent     16 per cent
Proportion with 25-49% marks             30 per cent     26 per cent
Proportion with 50-74% marks             38 per cent     48 per cent
Proportion with 75% and over             14 per cent     10 per cent

We know that the standard error of the mean is given by the
formula $\sigma/\sqrt{n}$, which in this example gives
$20/\sqrt{100} = 2.0$, since there are 100 candidates.
We now assume that if all possible samples of 100 were taken, the
mean marks found from them would be normally distributed with
a mean of 170 (the mean mark for the universe) and standard
deviation 2.0.

The actual sample has a mean 165, i.e. 5 less than the universe
mean. 5/2 = 2.5, so that the difference between the universe and
sample means is two-and-a-half times the standard error.

Now we know from Chapter II, p. 36, that about 95.5 per cent
of the area of a normal curve lies between the ordinates
$\pm 2\sigma$ and 99.7 per cent lies between the ordinates
$\pm 3\sigma$, so that the chance of a random value lying further
from the mean than $2.5\sigma$ is small (say 1/80 approximately).

In other words, the results for that year indicate that the school
is below the general level to an extent which is unlikely to be due to
sampling errors.

Similarly, the standard error of the standard deviation is
$\sigma/\sqrt{2n}$, in this example $20/\sqrt{200} = 1.4$ approx.

As the sampling distribution has a mean of 20 and a standard
error of 1.4 we expect most of the sample values of $\sigma$ to lie
within $\pm 3 \times 1.4$ of the value 20. Actually the value 22
calculated for the given school lies about one-and-a-half times the
standard error from the universe value and we cannot say that the
school is abnormal in the “scatter” of its marks.
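
The arithmetic of the two tests may be set out as follows. This is only
a sketch of the calculation described above; the variable names are
illustrative, and the tail probability is taken from the normal curve
by means of the standard library.

```python
from math import sqrt
from statistics import NormalDist

n = 100                                   # candidates from the school
universe_mean, universe_sd = 170.0, 20.0  # whole country
school_mean, school_sd = 165.0, 22.0      # the sample

# Test of the mean: standard error sigma / sqrt(n)
se_mean = universe_sd / sqrt(n)
ratio_mean = (school_mean - universe_mean) / se_mean
# Chance of a deviation at least this large in either direction
chance = 2.0 * (1.0 - NormalDist().cdf(abs(ratio_mean)))

# Test of the standard deviation: standard error sigma / sqrt(2n)
se_sd = universe_sd / sqrt(2 * n)
ratio_sd = (school_sd - universe_sd) / se_sd

print(se_mean, ratio_mean, round(chance, 4))
print(round(se_sd, 2), round(ratio_sd, 2))
```

The chance found for the mean is a little over .012, or roughly one in
eighty, while the standard deviation differs from the universe value by
less than one-and-a-half times its standard error.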

Similarly, we can compare the four proportionate class frequencies.
Take, for example, the class 50-74% marks, for which the
proportionate class frequencies are .38 for the sample and .48 for
the universe.

The standard error of the sampling distribution of the
proportionate class frequency is

$$\sqrt{\frac{pq}{n}} = \sqrt{\frac{.48 \times .52}{100}} = .05.$$

The absolute magnitude of the difference between the proportionate
class frequencies for the sample and the universe is

$$.48 - .38 = .10.$$

This difference is twice the standard error and is probably
therefore significant of a real difference between the sample and
the universe.

Similarly, the other classes could be dealt with and the final
conclusion would be that as regards that year’s results the school
was below the average.

(In this last section we have tacitly assumed that the normal law
holds for q. The inaccuracy cannot be great.)
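
The same calculation can be carried out for all four classes at once.
In the sketch below the class labels and variable names are
illustrative; the proportions are those given for the school and for
the whole country.

```python
from math import sqrt

n = 100
# class: (proportion in the school, proportion in the whole country)
classes = {
    "under 25%":    (0.18, 0.16),
    "25-49%":       (0.30, 0.26),
    "50-74%":       (0.38, 0.48),
    "75% and over": (0.14, 0.10),
}

ratios = {}
for label, (sample_q, universe_q) in classes.items():
    # Standard error sqrt(pq/n), using the universe value of q
    se = sqrt(universe_q * (1.0 - universe_q) / n)
    ratios[label] = (sample_q - universe_q) / se
    print(label, round(se, 4), round(ratios[label], 2))
```

Each ratio is the difference between sample and universe expressed as
a multiple of the standard error, so that the classes can be compared
on a common footing.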

18. When only the sample is available for analysis. 

In the second type of problem we cannot evaluate the standard
errors accurately since we do not know the parameters. Accordingly
we have to fall back upon the sample values as approximations and
take, for example, $s'/\sqrt{n}$ for the standard error of the mean,
where s' is the sample value of $\sigma$. It can be shown that it is
more satisfactory to take $s'/\sqrt{n-1}$ for the standard error of
the mean, using the form $\sigma/\sqrt{n}$ only when the parameter
$\sigma$ is known. The difference produced by replacing n by n - 1 is
only appreciable when n is small and for most purposes can be ignored
in actuarial work. For a proof the reader is referred to An
Introduction to the Theory of Statistics (see Bibliography).

Hitherto we have assumed the normal curve giving the sampling 
distribution to be known since we have had the mean and standard 
error given. 

Now, however, the standard error is only approximate and we 
do not know the value which should be assumed for the mean of 
the sampling distribution. 

Consider the following diagram: 
A represents the value of the statistic considered. We calculate
the standard error s' of the statistic, using the sample values of
any necessary constants, and assume that this is a sufficiently good
approximation to the true standard error.

We then measure distances $O_1A$, $AO_2$ equal to 2s' and draw
normal curves with $O_1$ and $O_2$ as origins and with s' as standard
deviation. The true sampling distribution might follow either of
these curves or any curve of the same shape in between them, but
the origin of the true curve is unlikely to lie outside the range
$O_1O_2$.

To see why this is so consider the curve with mean $O_1$. If it were
the sampling curve the error $O_1A$ of the sample value would be
twice the standard error.

Now although such an error may arise it is rather unlikely to be
exceeded. Similarly, if the true curve were the one with mean $O_2$
we should have an error $AO_2$ equal to 2s'.

It should not be assumed that 2s' is a sort of “foot-rule” to test
whether an error is likely or unlikely, but for many purposes it is
convenient to regard it as a likely upper limit. An error of as much
as 3s', 4s' or even 150s' may occur, but while some authors use 3s'
as an upper limit to the error likely to arise, such a criterion is
severe. It will be remembered that if the sampling distribution is
in fact normal the chance of an error arising as large as or larger
than 3s' is only about .0027 (see p. 36).

We may say therefore that, if A is the value of an index calculated
from a sample and s' is the approximate standard error, the true
value of the index is unlikely to lie outside the range
A - 2s' to A + 2s'

and very unlikely to lie outside the range
A - 3s' to A + 3s'.

Example 1 

On p. 65 of Chapter III we found a coefficient of correlation .782 for
a sample of 365.

The approximate standard error is therefore
$\frac{1 - (.782)^2}{\sqrt{365}} = .020$. The true value of r found
from unlimited observations may be as low as .782 - 2(.020), i.e.
.742, or even .782 - 3(.020), i.e. .722, but is unlikely to be less.
Thus very marked correlation does exist and the value of the
parameter can be fixed within fairly narrow limits.
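
The limits may be computed as follows. The function name
`se_correlation` is illustrative only; it applies the approximate
formula $(1-r^2)/\sqrt{n}$ with the sample value of r in place of the
parameter.

```python
from math import sqrt


def se_correlation(r, n):
    """Approximate standard error of a coefficient of correlation."""
    return (1.0 - r * r) / sqrt(n)


r, n = 0.782, 365
se = se_correlation(r, n)

range_2 = (r - 2 * se, r + 2 * se)   # unlikely to lie outside
range_3 = (r - 3 * se, r + 3 * se)   # very unlikely to lie outside

print(round(se, 4))
print([round(x, 3) for x in range_2])
print([round(x, 3) for x in range_3])
```

Working with the unrounded standard error the lower limits come out
very slightly below the figures .742 and .722 obtained with the rounded
value .020.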
19. Test of a hypothesis.

Although the above method of approach is often used to make
deductions from sample values, it is more logical to proceed on the
lines previously indicated, i.e. to decide on a hypothesis about the
universe and test its validity by means of the sample.

Thus in the previous example we might adopt the hypothesis
that the true coefficient of correlation was only .74. Taking the
standard error as $\frac{1 - (.74)^2}{\sqrt{365}} = .024$, we proceed
as follows:

observed value - assumed value = .782 - .74 = .042.

The probability that a value chosen at random lies within a
distance .042 of the mean

$$= \frac{2}{\sqrt{2\pi}\,\sigma}\int_0^{.042} e^{-\frac{x^2}{2\sigma^2}}\,dx.$$

Putting $x = \sigma x' = .024x'$, this reduces to

$$\frac{2}{\sqrt{2\pi}}\int_0^{1.75} e^{-\frac{x'^2}{2}}\,dx'.$$

Table I in the Appendix gives the value .9205 for this integral.

The probability that a value differs from the mean by more than
.042 is therefore only 1 - .9205 = .0795 and the chance that it is
less than the mean by more than .042 is half this, i.e. .04 approx.

This is so small that our hypothesis seems improbable and the
true coefficient of correlation is unlikely to be as low as .74.
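
The test can be reproduced without the printed table by taking the
area of the normal curve from the standard library. The sketch below
uses illustrative variable names and works with the unrounded standard
error, so that its figures differ slightly in the third decimal place
from those obtained from Table I.

```python
from math import sqrt
from statistics import NormalDist

n = 365
assumed_r = 0.74       # value of r under the hypothesis
observed_r = 0.782     # value found from the sample

se = (1.0 - assumed_r ** 2) / sqrt(n)   # about .024
deviation = observed_r - assumed_r      # .042
x = deviation / se                      # deviation in units of se

# Area of the unit normal curve within a distance x of the mean
within = 2.0 * NormalDist().cdf(x) - 1.0
# Chance of a deviation beyond x in one direction only
beyond_one_side = (1.0 - within) / 2.0

print(round(se, 4), round(x, 2), round(within, 4),
      round(beyond_one_side, 3))
```

The one-sided chance is again about .04, so the conclusion drawn above
is unaffected.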

20. Probability levels. 

For the normal curve in its simplest form

$$y = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{x^2}{2}}$$

the probability that a value chosen at random differs from the mean
by more than K, say, is given by the expression

$$1 - 2\int_0^{K} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{x^2}{2}}\,dx. \qquad (7)$$

Hitherto we have found the probabilities for given values of K,
but for many purposes it is convenient to fix the probability and use
equation (7) to find the corresponding value of K.

Tables have been prepared for various probability levels (as they
are called), the values most commonly used for the probability
being 50, 5, 1 and .1 per cent. At the foot of Table I in the Appendix
figures are given for twelve useful probability levels.

Probability levels are frequently met with in connection with
guarantees given by manufacturers as to the performance of their
products. For instance, a firm selling wire cables might know, as
a result of a large number of tests, that the breaking strain of a
particular type of cable followed roughly the normal distribution with
mean M and standard deviation $\sigma$. A guarantee of performance is
usually essential and it may be decided to fix it so that not more than
one rope in a thousand would be returned as unsatisfactory.

Suppose that the appropriate guaranteed strain is $M - K\sigma$. By
hypothesis the probability that a cable chosen at random has a lower
breaking strain is 1/1000. Since the distribution is roughly normal
this is also the probability that a cable has a breaking strain greater
than $M + K\sigma$.

Hence the chance that a random value differs from the mean
by $K\sigma$ or more is 2/1000 or .002.

The prepared tables show for the normal curve with unit standard
deviation what value of K corresponds to the probability level .002.
Having obtained the value of K in this way the manufacturer could
guarantee a breaking strain of $M - K\sigma$ with the knowledge that
only about one rope in a thousand would be rejected as below standard.
Incidentally only about one rope in a thousand would stand a
higher breaking strain than $M + K\sigma$, but this would not be of
practical importance.

It is important to remember that prepared tables do not
distinguish between values greater or less than the mean. They give
probabilities that a random value will lie within a given distance of
the mean (either above or below it) and care should be taken to
allow for this. For instance, in the above example the probability
of 1/1000 taken as the standard had to be doubled so as to include the
probability of breaking strains in excess of the mean.
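
Where no table is to hand, the value of K for a given probability level
can be found by inverting the normal integral. In the sketch below the
function name `k_for_level` and the figures taken for M and the standard
deviation of the cables are hypothetical, chosen only to illustrate the
calculation.

```python
from statistics import NormalDist


def k_for_level(level):
    """K such that a unit normal value differs from the mean by more
    than K, in either direction, with probability `level`."""
    return NormalDist().inv_cdf(1.0 - level / 2.0)


# One rope in a thousand below the guarantee means a two-sided
# probability level of .002.
K = k_for_level(0.002)

M, sigma = 5000.0, 200.0       # hypothetical mean and s.d. of strain
guaranteed = M - K * sigma     # breaking strain that can be guaranteed

print(round(K, 2), round(guaranteed, 1))
print(round(k_for_level(0.05), 2))   # the familiar 5 per cent level
```

Note that the level passed to the function is the doubled probability,
in accordance with the warning given above about tables which do not
distinguish between deviations above and below the mean.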
21. Further remarks on correlation.

It will be noticed that Example 1 deals with correlation. It is in
this connection that tests for what is known as “significance” are
particularly important. If any two series of numbers are written
down at random, or if two quite unconnected sets of pairs are taken
(e.g. number of the house and salary of the owner), the usual process
will give us a coefficient of correlation which may be positive or
negative, but is very unlikely to be exactly zero. A significance test
would almost certainly show, however, that the whole of this
apparent correlation may be due to errors of sampling and that the
universe value of r is probably zero.

Before attempting to draw any conclusions it is essential therefore
to test all coefficients of correlation by comparing them with the
standard error $1/\sqrt{n}$, which is appropriate in this case since
our hypothesis is that correlation in the universe is zero. Care must
be taken in the interpretation of correlation coefficients even when
they are found to be significant.

Suppose that an unskilled investigator wished to find if births
and deaths were correlated and obtained his data by taking the
births and deaths in a number of towns in a given year. The births
and deaths might quite well show a marked degree of correlation
which, while not spurious in itself, might cause some very
misleading results to be drawn. In actual fact the number of births is
the result of two factors, the birth-rate and the population of the
town (or more correctly the female population of child-bearing
ages), and similarly the number of deaths reflects the product of
the death-rate and the population of the town. The result of the
investigation is very probably due therefore to the correlation of the
“weights”, viz. the female population of child-bearing age and the
total population of the town, and not to any interdependence of
the fundamental birth-rates and death-rates. When the functions
dealt with are composite and reflect the combined operation of a
fundamental variable and sets of “weights” such as the populations
above, any correlation shown may arise simply from the “weights”
and should be further analysed.

In statistics we sometimes meet time series which may be defined
as records of the values of variables taken at regular intervals of
time. Correlation between two series is often spurious in that it
reflects only the result of changing conditions on two quite unrelated
variables. In an address given to the Royal Statistical Society
(J.R.S.S. Vol. LXXXIX, p. 1) G. Udny Yule gives an interesting
example of what he calls “nonsense correlation” between two
series. The first showed the proportion of Church of England
marriages to all marriages in the years 1866 to 1911, while the second
was formed by the standardized mortality of 1000 persons in England
and Wales for the same years. The coefficient of correlation was
+.95, with a standard error of only .014. This is, however, due to
the fact that over the period considered both the variables decreased
fairly steadily, so that both might be said to be “correlated” with
time. It is probable that some of the correlation in Example 1 of
Chapter III is spurious, since we are dealing with two time series
over the years 1810 to 1834.
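
The effect of a common trend can be illustrated with artificial figures.
The two series in the sketch below are invented for the purpose (they
are not Yule’s data): each falls steadily over 46 “years” and carries an
unrelated oscillation. The function `correlation` is an ordinary
product-moment coefficient written out in full.

```python
from math import cos, sin, sqrt


def correlation(xs, ys):
    """Product-moment coefficient of correlation of two series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


years = range(46)
# Two invented series: a steady fall plus an unrelated oscillation
a = [100.0 - t + 3.0 * sin(1.7 * t) for t in years]
b = [50.0 - 0.5 * t + 2.0 * cos(2.3 * t) for t in years]

r_levels = correlation(a, b)

# Year-to-year changes, from which the steady trend largely disappears
da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
r_changes = correlation(da, db)

print(round(r_levels, 2), round(r_changes, 2))
```

The series themselves are highly correlated, but their year-to-year
changes are not, which indicates that the correlation comes from the
common movement with time and not from any connection between the
oscillations.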

22. Comparison of two samples.

For the satisfactory comparison of two samples it is usually
desirable to compare means, standard deviations, and, in some
cases, class frequencies. Generally, tests for significance have to be
restricted to the functions whose standard errors can be found. One
obvious way of comparing the two sample values of a given index
(say, the mean) is as follows:

If $M_1$ is the value for one sample, and $\sigma_1$ its standard
error, then the “true” value of the index is unlikely to be outside
the range $M_1 - 2\sigma_1$ to $M_1 + 2\sigma_1$, and still less
likely to lie outside $M_1 - 3\sigma_1$ to $M_1 + 3\sigma_1$.
Similarly, if $M_2$ and $\sigma_2$ refer to the second sample, the
“true” value for the universe from which it was drawn should be
between $M_2 - 2\sigma_2$ and $M_2 + 2\sigma_2$ or, at any rate,
between $M_2 - 3\sigma_2$ and $M_2 + 3\sigma_2$. If the ranges
$M_1 - 2\sigma_1$ to $M_1 + 2\sigma_1$ and $M_2 - 2\sigma_2$ to
$M_2 + 2\sigma_2$ do not overlap the two samples are very unlikely to
be drawn from the same universe, because the parameter cannot lie
within a distance $2\sigma_1$ of $M_1$ and at the same time lie within
a distance $2\sigma_2$ of $M_2$. This test is, however, much too
severe for general use, for the following reason. We know from
Chapter II, p. 36, that in a normal distribution the chance of a value
selected at random lying further than $2\sigma$ from the mean is
1 - .9545, i.e. .0455, which is small; but if the ranges
$M_1 - 2\sigma_1$ to $M_1 + 2\sigma_1$ and $M_2 - 2\sigma_2$ to
$M_2 + 2\sigma_2$ in the above test did not overlap, the true or
universe values of the index would be the same only if $M_1$ and also
$M_2$ lay more than twice the standard error from the parameter. The
probability of this is $(.0455)^2$, or about 1 in 480.

To arrive at a more satisfactory test let us consider the difference
$M_1 - M_2$ between the two sample indices.

In Chapter III it was shown that if z is the difference between
two variables x and y,

$$\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2} \quad \text{(formula (20))},$$

if no correlation exists between x and y, or

$$\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y} \quad \text{(formula (24))},$$

if correlation does exist.

Now a standard error is merely another name for a standard
deviation and it follows therefore that in our previous notation the
standard error of $M_1 - M_2$ is $\sqrt{\sigma_1^2 + \sigma_2^2}$,
provided that we are satisfied that correlation does not exist between
the two samples.

A direct comparison of $M_1 - M_2$ with
$\sqrt{\sigma_1^2 + \sigma_2^2}$ will then indicate whether there is
any strong evidence that the means of the universes from which the
samples are drawn are unequal.

Example 2. 

The heights of 100 men are recorded and it is found that the mean
height is 5 ft. 9 in. and the standard deviation 2.24 in. A second sample of
100 men gives a mean height 5 ft. 7 in. and standard deviation 2.23 in.

To form an opinion as to whether the samples are drawn from the
same or different universes we compare both the means and the standard
deviations.

Using the formula $\sigma/\sqrt{n}$, the standard error of the first
mean is $\frac{2.24}{\sqrt{100}}$ in. = .224 in.

Similarly, the standard error of the second mean is .223 in. Assuming
therefore that there is no correlation between the samples the standard
error of the difference in the means is
$\sqrt{(.224)^2 + (.223)^2} = .316$ approx.
The actual difference is 2 in. or about 6 times the standard error and
this cannot therefore be accounted for by sampling errors.

To test the standard deviations we use the formula
$\sigma/\sqrt{2n}$ for the standard error of a standard deviation. The
standard error of the first sample standard deviation
$= \frac{2.24}{\sqrt{200}}$ in. = .158 in., and the same value applies
approximately for the second sample. Hence the standard error of the
difference of the standard deviations is approximately

$$\sqrt{(.158)^2 + (.158)^2} = .22 \text{ in. roughly.}$$

The actual difference is only .01 in., so that there is certainly no evidence
here of significant difference.

The test of the standard deviations is only of theoretical interest as
the difference in the means has already given convincing evidence that
the samples have not been drawn from the same universe. If this test
had been inconclusive we should have tested the hypothesis that the
samples came from universes with similar standard deviations. This is an
instance when the argument discussed on p. 104 could have been
employed as follows:

Using three times the standard error as the maximum error likely to
arise we find that the mean height in the universe from which the first
sample is drawn is unlikely to lie outside the range 5 ft. 9 in. ± .672 in.,
i.e. it is unlikely to be greater than 5 ft. 9.67 in. or less than 5 ft. 8.33 in.
Similarly, the mean height in the universe from which the second sample
is drawn is unlikely to lie outside the range 5 ft. 7 in. ± .669 in. The upper
limit of this range, i.e. 5 ft. 7.67 in., does not fall within the first range and
hence it is very unlikely that the two universes are the same.
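
The calculations of this example may be set out as below. The sketch
works in inches throughout and uses illustrative variable names; it
assumes, as in the text, that there is no correlation between the two
samples.

```python
from math import sqrt

n1 = n2 = 100
mean1, sd1 = 69.0, 2.24    # first sample: 5 ft. 9 in.
mean2, sd2 = 67.0, 2.23    # second sample: 5 ft. 7 in.

# Difference of the means, each mean having standard error s / sqrt(n)
se_diff_means = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
ratio_means = (mean1 - mean2) / se_diff_means

# Difference of the standard deviations, each with error s / sqrt(2n)
se_diff_sds = sqrt(sd1 ** 2 / (2 * n1) + sd2 ** 2 / (2 * n2))
ratio_sds = (sd1 - sd2) / se_diff_sds

print(round(se_diff_means, 3), round(ratio_means, 1))
print(round(se_diff_sds, 3), round(ratio_sds, 2))
```

The means differ by more than six times the standard error of their
difference, whereas the standard deviations differ by only a small
fraction of theirs.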

23. Amalgamation of samples. 

It will be remembered that, in evaluating the formulae
$\sqrt{pq/n}$, $\sigma/\sqrt{n}$, $(1-r^2)/\sqrt{n}$ for the
standard errors and so on, the values of the indices
q, $\sigma$ and r should be taken from the universe.

Unfortunately this is impossible because the only available data
are those included in the sample or samples; we are therefore faced
with the difficulty of making the best estimate we can. In large
sample work it is usually sufficient to take the sample value as an
approximation to the corresponding parameter but if the number
in the sample is small it is necessary to reduce bias; thus, if s is the
standard deviation in a sample of n, we should use $s/\sqrt{n-1}$ and
not $s/\sqrt{n}$ as an estimate of the standard error of the mean.

If we are considering more than one sample it is important to
specify clearly the hypothesis which is being tested since this
affects the way in which any parameters are estimated. Suppose,
for instance, that two samples of $n_1$ and $n_2$ observations
respectively with means $m_1$ and $m_2$ and standard deviations $s_1$
and $s_2$ are being considered. We might decide to test whether the
means differ significantly and to do this we set up the hypothesis
that the samples have been drawn from universes with the same mean
(we do not specify that they should have the same standard deviation
as we are not testing the difference between $s_1$ and $s_2$).

The best estimate we can make of the common mean of the two
populations is formed by amalgamating them and taking the mean
$(n_1m_1 + n_2m_2)/(n_1 + n_2)$ of the combined data. The estimates
of the standard errors of the means of the two populations are then

$$s_1/\sqrt{n_1 - 1} \quad \text{and} \quad s_2/\sqrt{n_2 - 1},$$

or simply $s_1/\sqrt{n_1}$ and $s_2/\sqrt{n_2}$ if the n’s are large.

If we assume no correlation we then take
$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ as the estimate of the
standard error of the difference in the means by which to test the
observed difference $|m_1 - m_2|$.

If, on the other hand, we were testing the same hypothesis
assuming that the samples were drawn from populations with the
same standard deviation ($\sigma$, say) we should amalgamate the
samples in order to form an estimate of this parameter $\sigma$.

Example 3. 

The following data have been obtained relating to schoolchildren:

                             No. with medium
                  Total        hair colour

Edinburgh          9,743          4,008
Glasgow           39,764         17,529

We proceed to test the hypothesis that the samples are drawn from
populations in which the proportion with medium coloured hair is the
same. To form an estimate of this common proportion we amalgamate
the two samples and assume the common parameter q to be equal to:

$$\frac{4008 + 17{,}529}{9743 + 39{,}764} = .4350.$$

The standard error of this value of q is

$$\sqrt{\frac{.4350 \times .5650}{9743}} \text{ for Edinburgh (sample size 9743)}$$

and

$$\sqrt{\frac{.4350 \times .5650}{39{,}764}} \text{ for Glasgow (sample size 39,764).}$$

Hence, if our hypothesis were correct and we took samples of 9743 and
39,764 respectively from the two populations, the expected value of the
difference between the proportions of children with medium coloured
hair would be zero and the standard error would be:

$$\sqrt{.4350 \times .5650 \left(\frac{1}{9743} + \frac{1}{39{,}764}\right)} = .0056.$$

The observed difference is

$$\left|\frac{4008}{9743} - \frac{17{,}529}{39{,}764}\right| = .0294.$$

As this is more than five times the standard error we conclude that our
hypothesis is untenable.

Before drawing any conclusions, however, we should need to be
satisfied that the description “medium hair colour” is uniform between
the two cities. Owing to the difficulties of obtaining a satisfactory
definition it is likely that different standards were adopted.
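
The working of this example is set out below as a short sketch with
illustrative variable names. The common proportion is estimated from
the amalgamated data, as the hypothesis requires.

```python
from math import sqrt

n_e, x_e = 9743, 4008      # Edinburgh: total, medium hair colour
n_g, x_g = 39764, 17529    # Glasgow: total, medium hair colour

# Common proportion estimated by amalgamating the two samples
q = (x_e + x_g) / (n_e + n_g)

# Standard error of the difference of the two sample proportions
se_diff = sqrt(q * (1.0 - q) * (1.0 / n_e + 1.0 / n_g))

diff = x_g / n_g - x_e / n_e
ratio = diff / se_diff

print(round(q, 4), round(se_diff, 4), round(diff, 4), round(ratio, 2))
```

The ratio of the observed difference to its standard error is rather
more than five, in agreement with the conclusion reached above.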


24. Loss of degrees of freedom. 

It should not be assumed that the values of parameters are always
estimated from the sample. Sometimes the hypothesis itself
specifies such a value. For instance, in the middle of p. 103 we
considered the hypothesis that two variables were really
uncorrelated (a very common hypothesis) and therefore assumed a
standard error of $1/\sqrt{n}$, which is obtained by putting r = 0 in
the formula quoted on p. 96.

When the values of parameters have to be estimated from the
sample the student may feel that this tends to be unduly favourable
to the hypothesis since the more the test depends on sample values
the more likely is it to indicate a good “fit” with the sample. This
point is unimportant if the number of values fitted is large but it is
of general interest and must be allowed for if only a few values are
being fitted. We say that “degrees of freedom” are lost when the
values of parameters are estimated from the sample. This phrase
will be discussed further in Chapter V where examples will be found
of the adjustments made in applying a certain test for this loss of
degrees of freedom.


Example 4. 


The following example illustrates how correlation may sometimes be
dealt with:

                            Total live     Male live     Proportion of
                            births in      births in     male births to
                              1937           1937        total births

Town A                         956            502            .525
Towns A and B combined        1406            697            .496

Is there any significant difference in these results?

If the proportionate class frequencies are compared directly,
correlation is bound to arise, since the proportion in town A
influences that for the combined towns, and the standard deviation of
the difference is of the more general form
$\sqrt{\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2}$. If the true
value of r is unknown the problem may sometimes be solved by giving r
its maximum and minimum values +1 and -1 respectively; a more refined
method of approach is however discussed in the next section.

The simplest method of dealing with this problem is to eliminate the
obvious correlation by subtracting the data for town A from the
combined data. We thus obtain for town B only:

Total live births in 1937                       450
Male live births in 1937                        195
Proportion of male births to total births      .433

Standard error of proportion of male births in town

$$A = \sqrt{\frac{.525 \times .475}{956}}, \qquad B = \sqrt{\frac{.433 \times .567}{450}}.$$

Standard error of difference of these proportions

$$= \sqrt{\frac{.525 \times .475}{956} + \frac{.433 \times .567}{450}} = .028.$$

The actual difference is .525 - .433 = .092.

As this is between three and four times the standard error it is significant.
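
The subtraction and the test that follows it can be written out as
below; the variable names are illustrative. Each town’s own proportion
is used in its standard error, as in the working above.

```python
from math import sqrt

births_a, males_a = 956, 502      # town A
births_ab, males_ab = 1406, 697   # towns A and B combined

# Remove the obvious correlation by isolating town B
births_b = births_ab - births_a
males_b = males_ab - males_a

q_a = males_a / births_a
q_b = males_b / births_b

se_diff = sqrt(q_a * (1.0 - q_a) / births_a
               + q_b * (1.0 - q_b) / births_b)
ratio = (q_a - q_b) / se_diff

print(births_b, males_b, round(q_b, 3))
print(round(se_diff, 3), round(q_a - q_b, 3), round(ratio, 2))
```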

If it is desired to compare the proportion for town A with the proportion
for the combined towns, the correlation can be allowed for as follows,
bearing in mind that a given proportion for town A can be associated
with one and only one proportion for the combined towns.

Let $q_A$, $q_B$ and $q_{A+B}$ represent the proportions of male births
for A, B and the combined towns respectively, and let $n_A$ and $n_B$
denote the total number of live births in A and B respectively. Let
$\sigma_A$, $\sigma_B$ and $\sigma_{A+B}$ denote the standard errors of
$q_A$, $q_B$ and $q_{A+B}$, and let r denote the coefficient of
correlation between $q_A$ and $q_{A+B}$.

Then the standard error of $q_A - q_{A+B}$

$$= \sqrt{\sigma_A^2 + \sigma_{A+B}^2 - 2r\sigma_A\sigma_{A+B}}. \qquad (8)$$

Now

$$q_{A+B} = \frac{n_Aq_A + n_Bq_B}{n_A + n_B}.$$

Hence, if capital letters denote deviations from the mean,

$$Q_{A+B} = \frac{n_AQ_A + n_BQ_B}{n_A + n_B},$$

since in considering standard errors we assume samples of the same size
to be taken, so that $n_A$ and $n_B$ are constants.

$$\therefore\ Q_{A+B}Q_A = \frac{n_A(Q_A)^2 + n_BQ_BQ_A}{n_A + n_B}. \qquad (9)$$

If we sum for all the N imaginary samples included in the sampling
distribution we have

$$\sum Q_{A+B}Q_A = Nr\sigma_{A+B}\sigma_A \quad \text{(from definition of r)}$$

while from equation (9) we have

$$\sum Q_{A+B}Q_A = \frac{1}{n_A + n_B}\left(n_A\sum (Q_A)^2 + n_B\sum Q_AQ_B\right).$$

The second term on the right vanishes because by hypothesis the
samples from A and B are independent, and

$$\frac{\sum Q_AQ_B}{N\sigma_A\sigma_B},$$

the coefficient of correlation between $q_A$ and $q_B$, must be zero.

Also $\sum (Q_A)^2 = N\sigma_A^2$,

so that we have, finally,

$$Nr\sigma_{A+B}\sigma_A = \frac{n_A}{n_A + n_B}N\sigma_A^2, \quad \text{i.e.} \quad r = \frac{n_A}{n_A + n_B} \cdot \frac{\sigma_A}{\sigma_{A+B}}.$$

But

$$\sigma_A^2 = \frac{.496 \times .504}{956}$$

and similarly

$$\sigma_{A+B}^2 = \frac{.496 \times .504}{1406}$$

(taking the observed value of $q_{A+B}$ as the mean value of the q’s
for both towns).

Hence the expression (8) becomes:

$$\sqrt{.496 \times .504\left(\frac{1}{956} + \frac{1}{1406} - 2 \cdot \frac{956}{1406} \cdot \frac{1}{956}\right)} = \sqrt{.496 \times .504 \times \frac{450}{956 \times 1406}} = .0091 \text{ approx.}$$

The observed difference .525 - .496 = .029.

As this is more than three times the standard error it is very probably
significant. If the calculations had been carried out to a greater degree
of accuracy it would in fact be found that the ratio of the difference of the
q’s to the standard error of the difference is the same whether we
compare $q_A$ with $q_B$ or $q_{A+B}$ with either $q_A$ or $q_B$.
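
The closing remark can be checked numerically. The sketch below, with
illustrative variable names, evaluates expression (8) with the value of
r found above, and then compares town A with town B directly. It
assumes that the combined proportion is used as the common value of q
in every standard error, which is the condition under which the two
ratios coincide.

```python
from math import sqrt

n_a, n_b = 956, 450
q_a = 502 / 956          # town A
q_b = 195 / 450          # town B
q_ab = 697 / 1406        # combined towns, taken as the common q

pq = q_ab * (1.0 - q_ab)
var_a = pq / n_a                 # sigma_A squared
var_ab = pq / (n_a + n_b)        # sigma_{A+B} squared

# Coefficient of correlation between q_A and q_{A+B}
r = (n_a / (n_a + n_b)) * sqrt(var_a / var_ab)

# Expression (8): standard error of q_A - q_{A+B}
se_a_vs_ab = sqrt(var_a + var_ab - 2.0 * r * sqrt(var_a * var_ab))
ratio_a_vs_ab = (q_a - q_ab) / se_a_vs_ab

# Direct comparison of A with B, the samples being independent
se_a_vs_b = sqrt(pq * (1.0 / n_a + 1.0 / n_b))
ratio_a_vs_b = (q_a - q_b) / se_a_vs_b

print(round(r, 4), round(se_a_vs_ab, 4))
print(round(ratio_a_vs_ab, 3), round(ratio_a_vs_b, 3))
```

The two ratios agree to the accuracy of the arithmetic, so that nothing
is lost by eliminating the correlation through subtraction instead of
allowing for it in the formula.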
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EXAMPLES 4 


1. A factory turns out an article by mass production methods. From
past experience, which is sufficiently extensive to be reliable, it appears
that 10 articles on the average are rejected out of every batch of 100.
Find the standard deviation of the number of rejects in a batch. What
is the approximate probability of 7 or more being rejected?

It is reported that several batches have recently been turned out
containing 20 to 30 rejects. What inference would you draw?


2. An examination is held in several different centres and the following 
data have been extracted from the results: 


                               Number of     Mean percentage    Standard deviation of
                               candidates    of marks           percentage of marks

Centre A                          127            44.8                 8.3
All other centres combined       2346            47.3                 6.5


Do you consider these data significant of any difference between the
candidates at Centre A and at other centres and, if so, why?


3. An assurance company, a large proportion of whose business consists
of 15- and 20-year term endowment assurances for sums assured of
£50 and £100, has in the past kept hand-written valuation cards on which
net premiums are inserted to three places of decimals. These cards are 
filed according to year of birth for whole life assurances, and according to 
year of maturity for endowment assurances, and are arranged within each 
group according to date of entry. 

Machine-punched cards are now to be adopted and it is desired to save 
space on the card by tabulating the net premiums to the nearest integer. 
You are asked what the error is likely to be in the total of a number of net 
premiums and how you would proceed to construct samples for the 
purpose of testing whether any bias in the figures is likely to cause a 
divergence from your theoretical results.

Investigate the problem and draft a short reply in non-technical 
language. 


4. An office has on its books 4000 policies of all classes subject to 
monthly premiums, the average sum assured being £300 per policy and 
deviations from the average being negligible in number and amount. The 
monthly premium for each policy is obtained by adding 2½ per cent to the
annual premium rate per £100 sum assured (this rate being calculated
to the nearest penny), dividing the amount so obtained by twelve, fractions
of a penny counting as one penny, and multiplying by the number
of £100's assured.

Estimate the amount of loading in excess of 2½ per cent secured,
indicating the statistical error involved.

5. Under a group life assurance scheme the employees of a large
industrial combine are on each 1st January assured against death during
the ensuing year for sums of £100, £200, £300, £400 or £500 according
to their salaries on the 1st January. The following schedule shows the
number of employees assured on 1st January 1937, subdivided according
to the nearest quinquennial age at that date, and, in the last column, the
rate of mortality used in calculating the premiums:


         No. of employees assured for          Total no. of   Total sum    Rate of
Age      £100    £200    £300    £400    £500    employees     assured     mortality

25       2000    1000     300     100      —        3,400        530,000      .003
30       1800    1100     500     300     100       3,800        720,000      .004
35       1600    1100     700     400     200       4,000        850,000      .005
40       1300    1000     800     500     300       3,900        920,000      .006
45       1000     900     700     600     400       3,600        930,000      .008
50        700     700     600     500     400       2,900        790,000      .012
55        300     400     400     400     500       2,000        640,000      .020

Total    8700    6200    4000    2800    1900      23,600      5,380,000        —


The death claims arising during 1937 amounted to £51,000. 

To what extent do you consider that the difference between the mortality
experienced during 1937 and that used in calculating the premiums
was significant? 

(You may ignore any correction for the grouping according to nearest 
quinquennial age.) 

6. (a) An office investigating its mortality experience among lives
aged x exactly observes that, among $E_x$ lives, $\theta_x$ deaths during the year of
age give a rate of mortality $q'_x$.

Assuming that the true rate of mortality $q_x$ is known, what deviation
would you consider significant? Give reasons.

State the test for significance of the difference between the observed
deaths $\theta_x$ and those expected by the true mortality $\theta'_x$ $[= E_x q_x]$ and
give a convenient practical approximation that you would make if $q_x$ were
in the neighbourhood of .01.

(b) The office, for the sake of convenience, would prefer to enumerate
the number of policies on lives aged x and the number of such policies
becoming claims, but it is suggested that the normal tests for significance
would be invalidated by the existence of lives on which two or more
policies are in force.

On what grounds is this criticism based? Illustrate your answer by 
assuming: 

(1) That there are two policies in force on each of the lives. 

(2) That one-half of the lives have one policy each and the other 
half two policies. For this purpose you may assume that the rate of 
mortality is the same among lives having one policy and lives having 
two policies. 

7. A specified universe consists of N measurements and its standard
deviation is $\sigma$. A group of n measurements is selected at random and its
standard deviation is obtained as $s$. Show that the mean value of $s^2$,
when many such samples are taken, approximates to

$$\frac{n-1}{n}\cdot\frac{N}{N-1}\,\sigma^2.$$



CHAPTER V 


GRADUATION. GENERAL CONSIDERATIONS
AND TESTS

1. Before we proceed to the description of individual methods of 
graduation there are several matters which can conveniently be 
dealt with as they arise in the application of several, or even all, 
methods. 

2. As has been said previously, any body of data examined by the 
actuary should be regarded as a sample (even though all the available
data have been included in the investigation) and hence any
results deduced will be subject to errors of sampling. 

By speaking of parameter values of, say, $q_x$ or sickness rates $z_x$,
we mean the ideal values which would have been obtained had 
there been unlimited data available (and the machinery for handling 
them) and had the years considered been themselves free from 
any accidental peculiarities. It is difficult to decide which of the 
peculiarities can properly be regarded as accidental and on this point 
opinions may differ. For instance, a severe influenza epidemic may 
render a given year abnormal, but the effect of epidemics in general 
should be allowed for in the universe values. Only the intensity of 
the attack may be regarded as abnormal. Again, while the years 
1914-18 were abnormal it does not necessarily follow that, in considering
our picture of the universe rates of mortality, sickness,
fertility, etc., such upheavals as occurred in those years should be 
ignored. 

3. The statistical rationale of graduation. 

It may be said that the art of graduation is to arrive at an estimate 
of the true or universe values from the values derived from a particular
investigation. For the purpose of most of the tests discussed
in this chapter it will be assumed that the data examined are a
random sample. In actual practice other considerations arise, for,
in order to formulate estimates for the future, the actuary records
and analyses data which must of necessity relate to the past. The
graduated rates may not therefore indicate what results would have
been obtained but for the limitations of sampling. The best example
of this is given by the a(m), a(f) tables, which were derived from
an investigation covering the years 1900-20. The graduated table 
finally adopted did not reflect the actual experience of those years. 

There are other factors which sometimes apply, such as a desire 
to err on the side of safety either over the whole table or over a 
particular range; for the purpose of this chapter, however, it will 
be assumed that the graduated series of values represents an 
estimate of true or universe values. We shall develop tests to be
applied on the assumption that the data used in the investigation 
represent a random sample, i.e. that no bias has been introduced 
accidentally or deliberately. 

4. Properties of a well-graduated series. 

In the language of the previous chapter we shall set up as a 
hypothesis that the graduated rates are the “true” rates and that 
the observed rates differ from them only owing to sampling errors. 

There is a saying “natura non agit per saltum” expressing the 
fundamental fact that natural forces operate gradually and that their 
effects become apparent continuously and not in sudden jerks. In 
its application to mortality and sickness data it implies that any 
rates which may reflect the operation of purely natural causes should 
not exhibit any discontinuities, breaks, or sudden and unexpected 
changes. In other words, we expect any set of true values to follow 
a smooth curve, or, as we usually say, the graduated series must 
possess a high degree of smoothness. 

We know the sampling distribution of functions such as $q_x$; this
is only a proportionate class frequency, and if we assume that its 
distribution is roughly normal we can form some idea of the 
probabilities of errors of various sizes arising from the operation 
of random sampling. 

Taking the graduated values as true values, we can calculate the 
discrepancies between the true values and the sample values (i.e. those 
revealed by the data) and examine whether they are reasonable or not.

We thus have two main sets of tests: (i) tests for smoothness and 
(ii) tests of fidelity to data. 




5. Tests for smoothness. 

For practical purposes any table which is to be extensively used 
should have a very high degree of smoothness: otherwise the more 
complicated functions based on it, such as policy values, will show 
alarming and even embarrassing irregularities. 

It is usually found desirable therefore to examine the first three 
orders of differences of the graduated values. Generally speaking 
the third order of differences will be very small and it has been 
suggested that, in comparing two different graduations for smoothness,
the sum of the third differences should be found for each
table, the summation including all the values. On this basis it is 
usual to accept, as the better graduation, that which gives rise to the 
smaller total. On the other hand, it should be remembered that 
smoothness in the successive orders of differences is more important 
than smallness. It is quite easy in fact to write down any number of
ideally smooth series derived from mathematical formulae for which 
the successive orders of differences, although smooth, show little 
or no tendency to diminish. 
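
The comparison described above is easily mechanized. The sketch below, in Python, forms successive orders of differences and totals the third differences without regard to sign (an assumption, since the text does not say how signs are to be treated). The function names and the two short series are illustrative and are not taken from any table in this book.

```python
def differences(values, order=1):
    """Return the differences of the given order of a series."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def third_difference_total(values):
    """Sum of the third differences, taken irrespective of sign (an assumption)."""
    return sum(abs(d) for d in differences(values, 3))


# A cubic has constant third differences; a geometric series has third
# differences which are perfectly smooth but do not diminish.
cubic = [x ** 3 for x in range(1, 6)]    # 1, 8, 27, 64, 125
geometric = [2 ** x for x in range(6)]   # 1, 2, 4, 8, 16, 32

print(differences(cubic, 3))       # [6, 6]
print(differences(geometric, 3))   # [1, 2, 4]
```

The second series illustrates the remark that smoothness of the differences matters more than their smallness.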

It should be remembered, moreover, that although natural 
causes are unlikely to produce irregularities, other factors may be 
operating to do so. Irregularities which they produce are often 
inherent in the data and no attempt should then be made to eradicate 
or reduce them. For instance, many pension schemes provide for 
retirement between ages 60 and 65 or before age 60 for reasons of 
ill health. The retirement rates for such a scheme would probably 
show a steady increase up to age 60 but a sudden jump when that 
age was reached. There might well be two peaks, one at age 60 
(the first age for normal retirement) and the other at age 65, with 
a trough in between. These discontinuities, or sudden changes of 
curvature, are due to the operation of the rules, and unless there is 
good evidence that special circumstances, unlikely to recur, have 
exaggerated them no attempt should be made to reduce them so as 
to produce a more regular curve. 

6. “Errors” and “mistakes”.

As hitherto, we shall use the word “error” to indicate the discrepancy
between a parameter and the corresponding value derived
from a random sample. The error is solely due to the smallness of
the sample. Unfortunately, other factors such as human fallibility
have to be reckoned with, and we shall use the word “mistakes” to
refer to inaccuracies or discrepancies between universe and sample
values due to causes other than random sampling.

It may be said at once that there is no satisfactory way of dealing 
with mistakes except their complete eradication before graduation 
commences. Some of the tests described later may draw attention 
to mistakes and all methods of graduation do in fact reduce their 
disturbing effect. If mistakes can be traced the whole graduation 
should be done again after they have been corrected. Unfortunately 
this is often impracticable. 

7. Bias. 

As has been previously stated, bias does not introduce errors in 
the statistical sense and the foregoing theory does not apply. It may 
arise through some personal factor, such as misstatement of age, 
and sometimes the statistical processes may themselves introduce 
bias. 

If bias cannot be eliminated, the method of graduation should be 
chosen so as to reduce the effect to a minimum. The method used 
for the English Life Tables in Chapter X is an excellent example. 

If bias is at all extensive the ordinary tests fail, except that the 
examination of the deviations and accumulated deviation is still 
very valuable (see para. 11).

8. Special objects of the investigation. 

In deciding upon a method of graduation, and in testing the 
results, the object of the investigation should be borne in mind. For 
instance, if a mortality table is required for use in ordinary life 
assurance it is important that the mortality should not be underestimated,
while if it is to be used for calculating annuity purchase
money the converse holds. For valuation purposes the gradient 
should not be underestimated over the important range of ages 40 
to 70, since net premium reserves generally vary with the gradient 
of the mortality curve rather than with the lightness or heaviness 
of the rates. 

9. Choice of the function to be operated upon.

The rough data derived from the investigation are usually available
in the form of the exposed to risk at each age or group of ages
and the corresponding decrements (deaths, retirements, marriages, 
etc.). 

We may either 

(i) attempt to graduate the exposed to risk and the decrements 
separately, finally deriving the rates of decrement by division; 
or (ii) obtain ungraduated rates of decrement direct from the rough 
data and then graduate them. 

The first method has certain advantages in special problems, 
notably those connected with the English Life Tables. Above all it 
enables the operator to keep in view the weight of the data at each 
age while he is performing the graduation. 

If the second method is used, one rate of decrement is very much 
like another, irrespective of the volume of the data on which it is 
based; and although in testing the graduation the weight of the 
observations is allowed for this is not done during the actual process 
of graduation. 

On the other hand the following considerations should be borne 
in mind: 

(a) in many experiences the exposed to risk is essentially a discontinuous
function; and

(b) a slight distortion of the exposed to risk may coincide with a
slight distortion of the decrement in the opposite direction
and the combined effect may be quite appreciable. Moreover,
if the rate of decrement is increasing slowly, a slight distortion
of the decrements may produce a graduated rate
decreasing with an increase in age over a range where such a
feature is unlikely to represent the facts.

As regards (b) it is not uncommon to graduate the exposed to 
risk and then to adjust the decrements so as to produce the same 
crude rates as before. The adjusted values are then graduated and 
serious distortion of the resulting rates is unlikely. The extra work 
involved is usually well worth while. 
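
The adjustment described in the last paragraph amounts to applying each crude rate to the graduated exposed to risk. A minimal sketch in Python follows; the function name and the figures are hypothetical.

```python
def adjusted_decrements(exposed, decrements, graduated_exposed):
    """Decrements adjusted so that, with the graduated exposed to risk,
    they reproduce the crude rates given by the rough data."""
    return [g * d / e for e, d, g in zip(exposed, decrements, graduated_exposed)]


exposed = [1000, 1100]             # rough exposed to risk (hypothetical)
decrements = [10, 22]              # rough decrements (hypothetical)
graduated_exposed = [1020, 1080]   # exposed to risk after graduation (hypothetical)

adjusted = adjusted_decrements(exposed, decrements, graduated_exposed)

# The crude rates are unchanged by the adjustment.
crude = [d / e for d, e in zip(decrements, exposed)]
reproduced = [a / g for a, g in zip(adjusted, graduated_exposed)]
```

It is the adjusted decrements, not the rough ones, which would then be graduated.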

On the whole the method (ii) above, viz. the calculation of crude
rates of decrement which are then graduated is almost invariably
adopted. By this means the work is reduced to about one-half of 
that involved in method (i). Moreover, a quotient such as a rate of 
decrement is much more likely to progress smoothly from age to 
age and to follow approximately a mathematical curve than a 
function such as the exposed to risk. 

In considering mortality tables the rate of mortality is usually 
chosen for graduation, but the form of the curve with its gentle 
gradient at the younger ages and its very rapidly increasing gradient 
at higher ages makes it unsuitable for some purposes. 

For this reason $\log q_x$ and $\log(q_x + .1)$ have sometimes been used,
since these functions tend to be much flatter and can therefore be 
represented more easily in the graphic form. 

Again, the reader will be familiar with tables in which $\mu_x$ is of the
form $A + Bc^x$. Because of the practical advantages of such a table
many attempts have been made to find satisfactory graduations of
$\mu_x$ or colog $p_x$ (a closely allied function) using Makeham’s Law or
some modification of it, such as $A + Hx + Bc^x$.
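
For reference, Makeham’s Law in its usual form makes the force of mortality $\mu_x = A + Bc^x$, and integrating over a year of age gives colog $p_x$ (to base e) as $A + Bc^x(c-1)/\log_e c$, which is why the two functions are so closely allied. The sketch below, in Python, evaluates both. The function names and the constants are hypothetical and are not fitted to any experience.

```python
import math


def makeham_mu(x, a, b, c):
    """Force of mortality under Makeham's Law: A + B c^x."""
    return a + b * c ** x


def makeham_colog_p(x, a, b, c):
    """-log_e p_x, i.e. the integral of the force of mortality from x to x + 1."""
    return a + b * c ** x * (c - 1.0) / math.log(c)


a, b, c = 0.0007, 0.00005, 1.10   # hypothetical constants
colog_p_40 = makeham_colog_p(40, a, b, c)
p_40 = math.exp(-colog_p_40)      # probability of surviving the year of age 40
```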


10. Comparison of graduation methods. 

In graduation, more than in any other branch of actuarial science, 
it may be said that “the proof of the pudding is in the eating”. No 
method, however unorthodox or crude it may appear, should be 
condemned in its application to a particular set of data provided 
that the results produced are satisfactory. It may be felt that the 
method is unlikely to be equally successful if used for other tables; 
that is, however, no criticism of its adoption for the particular table 
considered. 

It will be seen later that each of the well-known methods has 
peculiarities which make it likely to be valuable in special circumstances.
For example, the graphic method is especially useful if
the data are scanty. The experimenter should not, however, be
deterred from trying any process which appears likely to suit the
particular problem in hand. The important point to remember is 
that the graduated table should satisfy the two essentials of smoothness
and adherence to data.

11. Deviation and accumulated deviation.

In discussing tests for adherence to data it will be assumed for 
clearness that we are dealing with rates of mortality. Most of the 
results, however, apply to any tables of rates which can be regarded
as proportionate frequencies (e.g. rates of withdrawal or
marriage); they do not apply to sickness rates, which allow for 
duration of incapacity and are not therefore related to frequencies 
of occurrence. 

In order to allow for the weight of the data at each age it is usual 
to compare actual frequencies and not proportionate frequencies. 
Thus the ungraduated rate of mortality is not usually compared
with the graduated rate; both are weighted by the exposed to risk
$E_x$ and the actual deaths are then compared with the value of $E_x q_x$.
$E_x q_x$ is defined as “the expected deaths at age x”. In terms of statistics
the $E_x$ exposed in the sample are imagined as divided into two
classes, those dying before age x+1 and those surviving to that age.
The observed class frequency in the “deaths” category is $\theta_x$, while
the true value which would have resulted if the universe rate of
mortality had applied is $E_x q_x$.

The expression “actual deaths minus expected deaths” is usually
referred to as the “deviation” and will often be denoted by $\theta_x - E_x q_x$
or, more briefly, by $A - E$.

Table VI on pp. 122, 123 is typical of a graduation based on 
scanty data and shows how most of the functions needed in testing 
it are calculated. The functions “accumulated deviation” and 
“approximate standard error” will be discussed later. 

The columns for $\Delta q_x$ and $\Delta^2 q_x$ are inspected for smoothness, but
on this occasion $\Delta^3 q_x$ has not been found, since only three significant
figures of $q_x$ are available and $\Delta^2 q_x$ is clearly smooth.

In column (8) the deviation is shown on the left if it is negative 
and on the right if it is positive. Allowing for sign the total of the 
column is .3, which checks the numerical work, since the total
actual deaths are 373 (column (3)) while the total expected deaths
are 372.7 (column (7)). The sum of the actual deaths should always
be nearly equal to the sum of the expected deaths, i.e. the total 
deviation allowing for sign should be approximately zero. 

If the curve representing the graduated values closely followed
the curve representing the observed values we should expect the 
two curves to cross and re-cross at frequent intervals. In other 
words we should expect the deviation to change sign frequently. 
In the table above, the deviation changes sign ten times in twenty- 
one values, and this must be considered very satisfactory. 

The mere crossing and re-crossing of the curves is not enough, 
however, since large positive deviations at one point may be 
followed by only small negative deviations. To investigate this point 
the column headed “accumulated deviation” is formed by adding
the previous column from the top downwards, i.e. the second entry 
is the sum of the first two in column (8), the third is the sum of the 
first three in column (8), and finally the last is the sum of the whole 
of column (8): this again checks the numerical work. 

Clearly any item in column (9) represents the difference between 
the total actual deaths up to that age and the total corresponding 
expected deaths. The figures, therefore, should never be large and 
the same applies to their total, while the sign should change fairly 
frequently. In the table shown the total of column (9), allowing for 
sign, is 6.5 and there are nine changes of sign; this indicates satisfactory
agreement with the data.
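
The formation of the columns of deviations and accumulated deviations may be sketched as follows in Python. The three ages shown are hypothetical and are not taken from Table VI; the function name is likewise an assumption.

```python
from itertools import accumulate


def deviation_columns(exposed, graduated_q, actual_deaths):
    """Return expected deaths, deviations (A - E) and accumulated deviations."""
    expected = [e * q for e, q in zip(exposed, graduated_q)]
    deviation = [a - x for a, x in zip(actual_deaths, expected)]
    accumulated = list(accumulate(deviation))
    return expected, deviation, accumulated


exposed = [1000, 1200, 900]           # hypothetical exposed to risk
graduated_q = [0.010, 0.012, 0.015]   # hypothetical graduated rates
actual_deaths = [12, 13, 14]          # hypothetical actual deaths

expected, deviation, accumulated = deviation_columns(
    exposed, graduated_q, actual_deaths
)

# The last accumulated deviation equals total actual less total expected
# deaths, which is the check on the numerical work mentioned in the text.
check = sum(actual_deaths) - sum(expected)
```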

Most of the tests of a graduation relate to the size of the 
deviations. Before we leave the question of changes of sign, 
however, the following section may be of interest. 

12. Changes of sign in the deviation and accumulated deviation. 

If the graduated rates are regarded as parameters of the universe 
from which the given data were obtained by random sampling, any 
given deviation is equally likely to be positive or negative although 
strictly this assumes that the graduated value is a median rather 
than a mean. 

If the signs are random the number of changes of sign in the 
column of deviations should be roughly equal to the number of
non-changes; the same applies to the signs in the column of 
accumulated deviations. 
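
Counting changes and non-changes of sign is a mechanical matter. A minimal sketch in Python follows; zero values are ignored, which is an assumption made for the purpose of illustration, and the figures are hypothetical.

```python
def sign_changes(values):
    """Return (changes, non_changes) of sign between successive non-zero values."""
    signs = [v > 0 for v in values if v != 0]
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    non_changes = max(len(signs) - 1, 0) - changes
    return changes, non_changes


deviations = [2.0, -1.4, 0.5, 0.3, -0.2]   # hypothetical deviations
print(sign_changes(deviations))             # (3, 1)
```

With twenty-one deviations there are twenty opportunities for a change, so that the ten changes found in Table VI are exactly balanced by ten non-changes.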

This forms the basis of a test suggested by H. W. Haycocks 
in the discussion which followed the reading of H. L. Seal’s paper, 
“Tests of a Mortality Table Graduation”.
In that paper two more refined tests for changes of sign in the
deviations were discussed. The first of these was devised by Makeham
(J.I.A. Vol. xxviii), and although the test is rather inelastic,
the paper in which it appeared forms a useful introduction to a
more modern paper by W. L. Stevens, “Distribution of groups in
a sequence of alternatives”, Ann. Eugen., Lond., Vol. ix, p. 10.
Seal showed how Stevens’s technique could be applied to changes 
of sign in a series of deviations, but Stevens’s original paper must 
be studied by anyone who wishes to make use of the method. 

13. Standard error of a deviation. 

We know that if a particular class had a proportionate frequency 
q in the universe the class frequency in random samples of n will 
have a binomial frequency distribution with mean nq and standard 
error $\sqrt{npq}$. If q is the rate of mortality $q_x$ and the number in the
sample is $E_x$ this means that the deaths would form a
binomial frequency distribution with mean $E_x q_x$ and standard error
$\sqrt{E_x p_x q_x}$, where $q_x$ is the true rate of mortality, which we may take
as the graduated rate. In practical work $p_x$ is usually so near to unity
that the standard error is taken as $\sqrt{E_x q_x} = \sqrt{\text{expected deaths}}$. This
function is tabulated in the last column of Table VI and gives a 
means of testing the size of the deviations. 

Since the expected deaths E^qx are the mean of the sampling 
distribution the deviation is merely a sampling error, and if we 
apply the results derived from the normal curve we can say that the 
deviation should not exceed twice the standard error, or at any rate, 
three times the standard error. Accordingly, we compare each 
deviation with the value in the last column, regardless of sign, and 
estimate roughly the probability of its arising through sampling 
errors. In Table VI only four deviations exceed the standard error— 
an excellent result which is probably fortuitous. 
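
The comparison of a deviation with its standard error may be sketched in Python as below. The exposed to risk, rate and deaths are hypothetical, and the function names are assumptions made for illustration.

```python
import math


def standard_error(exposed, q, approximate=False):
    """Standard error of the deaths: sqrt(E p q), or sqrt(E q) if approximate."""
    expected = exposed * q
    return math.sqrt(expected if approximate else expected * (1.0 - q))


def deviation_ratio(actual, exposed, q, approximate=True):
    """Deviation (actual less expected deaths) in units of its standard error."""
    return (actual - exposed * q) / standard_error(exposed, q, approximate)


# Hypothetical age: 10,000 exposed, graduated rate .01, 115 actual deaths.
exact = standard_error(10_000, 0.01)                     # sqrt(99)
rough = standard_error(10_000, 0.01, approximate=True)   # sqrt(100)
ratio = deviation_ratio(115, 10_000, 0.01)               # about 1.5
```

The two forms of the standard error differ very little when the rate is small, which is the justification for the simpler form used in practical work.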

Although we have so far compared each individual deviation with 
its approximate standard error more information is often obtained 
by considering several values together. For instance, if we find a 
succession of, say, three positive deviations, we can compare their 
sum with the sum of the standard errors; the same applies still more 
forcibly if we find a succession of four or five deviations of like sign. 
Thus the last three deviations in Table VI total 5.0, while the corresponding
standard errors total approximately 22.1, about four
times as much.

Similarly, we may group the data in order to obtain a group rate
of mortality $q'$,

where $q' = \dfrac{\text{Total deaths for group}}{\text{Total “exposed” for group}}$,

and compare the group deviation with $\sqrt{\Sigma E_x\,q'p'}$ or, more simply,
with $\sqrt{\Sigma E_x\,q'}$. This test is not strictly accurate, since the data at
each age form separate samples with different sampling distributions,
although as a practical device it has its uses.

It should be emphasized that, although deviations in excess of 
two or three times the standard error indicate distortion of the 
data, in graduation the converse does not hold. Deviations can be, 
and often are, only a small fraction of the standard error in good 
graduations, and should not then be regarded as evidence of undergraduation.
Data are said to be undergraduated if the graduated
curve has adhered too slavishly to the ungraduated values. 


14. Application of the probable error. 

It will be remembered that in the normal curve the probable error 
is approximately two-thirds of the standard deviation. Hence, if a 
sampling distribution is approximately normal, we may take the 
probable error as roughly $\tfrac{2}{3}$ (standard error) and from the definition
of probable error we should expect roughly half the observed
deviations to fall short of this value and the rest to exceed it. Thus
at age 61 in Table VI we may take the probable error as about 3.0,
and if we took a great many samples each of 917 lives we should 
expect the deviation (irrespective of sign) to exceed 3 in about half 
of the samples. 

Actually, although we have only one sample at each age, the argument
leads us to expect about half the deviations to fall short of the
respective probable errors. It must be borne in mind that the
deviations are not part of one sampling distribution; each is a single
representative of its own sampling distribution. In Table VI,
fifteen deviations are less than $\tfrac{2}{3}\sqrt{\text{expected deaths}}$, while only six
exceed that value. This means that, on the whole, the deviations are
less than we should expect and suggests that the data may be
undergraduated. This test alone would not cause the graduation
to be condemned.
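
The count described above may be sketched in Python as follows, taking the probable error as two-thirds of the square root of the expected deaths. The figures are hypothetical and the function name is an assumption.

```python
import math


def probable_error_split(deviations, expected_deaths):
    """Count deviations falling short of the probable error, and the remainder."""
    short = sum(
        1
        for d, e in zip(deviations, expected_deaths)
        if abs(d) < (2.0 / 3.0) * math.sqrt(e)
    )
    return short, len(deviations) - short


deviations = [1.0, -4.0, 2.0]        # hypothetical deviations
expected_deaths = [9.0, 16.0, 4.0]   # probable errors 2.0, 2.67 and 1.33
print(probable_error_split(deviations, expected_deaths))   # (1, 2)
```

In Table VI the corresponding split is fifteen to six, against the roughly even division which random sampling would lead us to expect.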

15. Application of the mean deviation. 

Although comparison of deviations with the probable errors as 
above may draw attention to the possibility of undergraduation, a 
more satisfactory test is needed. For this, the mean deviation may 
be used. In a normal distribution the mean deviation is roughly 
four-fifths of the standard deviation. Hence, if we make the usual 
assumption that in suitable conditions a binomial distribution is 
approximately normal we can take the mean deviation (A − E)
in a great many samples of lives aged x as

$.8\sqrt{E_x p_x q_x}$ or, approximately, $.8\sqrt{E}$.

In other words the deviation regardless of sign should on the
average be $.8\sqrt{E}$. We have only one sample of lives aged x and
cannot expect the observed deviation to approximate to the
average value. If, however, we sum the deviations regardless of
sign for all the available ages the total should approximate to
$.8\,\Sigma\sqrt{E}$, because, although some deviations will be greater than
their mean value and some less, these differences will tend to 
cancel out when a great many values are amalgamated. 

In Table VI the total of the deviations regardless of sign is 
16.4 + 16.7 = 33.1. Four-fifths of the total of the last column is 62.5,
so that the total deviations are about half the expected value, thus 
tending to confirm our previous impression that the data have been 
undergraduated. 
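
This test reduces to the comparison of two totals. A minimal sketch in Python follows; the function name is an assumption and the two ages shown are hypothetical.

```python
import math


def mean_deviation_test(deviations, expected_deaths):
    """Return total |A - E|, its anticipated value .8 * sum(sqrt(E)), and their ratio."""
    observed = sum(abs(d) for d in deviations)
    anticipated = 0.8 * sum(math.sqrt(e) for e in expected_deaths)
    return observed, anticipated, observed / anticipated


# Two hypothetical ages: observed total 7.0, anticipated total .8 x 15 = 12.0.
observed, anticipated, ratio = mean_deviation_test([3.0, -4.0], [25.0, 100.0])

# The totals quoted in the text for Table VI give a ratio of about one-half.
table_vi_ratio = 33.1 / 62.5
```

A ratio well below unity points towards undergraduation, and one well above it towards distortion of the data.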

It should be emphasized however that no one test is conclusive 
and that in any event it is unlikely that a graduation would be discarded
merely because it seemed to adhere too closely to the data,
provided the smoothness were satisfactory. In nearly all such
instances it will be found that the differences of the graduated rates
progress irregularly and for this reason an attempt would be made 
to find a better graduation. 

16. Limitations of the foregoing tests.

The tests for adherence to data, so far discussed, may be summarized
as follows:

(1) The smallness of the individual values and of the totals in the
columns showing the deviations and the accumulated deviations.

(2) The number of changes of sign of the deviations and the 
accumulated deviations. 

(3) Comparison of each deviation with the standard error both 
age by age and by suitable age-groups. 

(4) Comparison of each deviation with its approximate probable 
error. 

(5) Comparison of the total of the deviations, irrespective of sign, 
with four-fifths of the total of the standard errors. 

Although for most practical purposes these tests are sufficient they 
are subject to the following limitations: 

(1) The smallness of the deviations and accumulated deviations 
is necessary for a good graduation; each of them may, however,
be very small indeed without giving any indication that
the adherence to data is too good. The test can give evidence 
of distortion of the rough data but it gives no evidence of 
undergraduation. 

(2) It is difficult to say how many changes of sign are to be 
expected or how many are to be regarded as satisfactory. 
The problem often arises of how far the number of changes 
can differ from the number of non-changes before we may 
regard the graduation as suspect. 

(3) The deviations should bear reasonable ratios to the standard 
errors, although here again we have no evidence of undergraduation
however small the ratios may be.

(4) In theory this test reveals undergraduation as well as overgraduation
(i.e. distortion), but it is rather insensitive.

(5) Evidence of undergraduation is obtained rather more reliably 
than by (4), but the mere addition of deviations irrespective 
of sign is not really satisfactory. What is required is some
means of finding a combined probability that the observed 
deviations would arise from random sampling. 

Two such tests are discussed in Seal’s paper and one of them,
known as the $\chi^2$ test, is of very general application outside actuarial
work. The theoretical work involved in the $\chi^2$ test is rather difficult
and beyond the scope of this book, but it is hoped that the
following introductory notes will enable the student to read one of
the standard textbooks on the subject more profitably after obtaining
a preliminary grasp of the fundamental principles involved.

17. The $\chi^2$ test.

To combine deviations by straightforward addition is clearly 
unsatisfactory since positive and negative items will tend to cancel 
out. Test (5) discussed above avoids this difficulty by ignoring signs. 
Another obvious way is to square the deviations before adding. The 
student will by now be familiar with the idea of measuring discrepancies 
such as deviations in terms of their standard errors, as 
was done for instance in considering regression lines. It would seem 
logical therefore to divide each deviation by its standard error 
before squaring and adding, thus arriving at a function 

$$\sum \frac{(\theta_x - E_x q_x)^2}{E_x p_x q_x}$$

or more generally 

$$\sum \left( \frac{\text{actual value} - \text{expected value}}{\text{standard error}} \right)^2 .$$

This function is known as χ². 

Clearly χ² will be small if the graduation adheres closely to the 
data and large if the deviations are large. By means of prepared 
tables it is possible to find the probability that the χ² actually found, 
or one even greater, is likely to arise from simple sampling. If 
this probability is small, i.e. if a value of χ² as great as or greater 
than the one observed is unlikely to occur, the graduation has 
departed too far from the data. If the probability is large we know 
that a greater value of χ² was very likely to occur, so that the small 
value actually obtained was probably due to causes other than 
sampling errors. The graduation has then adhered too closely to 
the rough data. 

The prepared tables are very extensive, since one is required for 
each degree of freedom. The exact meaning of this term is explained 
in statistical textbooks, but the following remarks, though not 
strictly accurate in detail, may help the reader to grasp the underlying 
ideas. Suppose that we have any body of data split into 
cells. For instance, the A1924-29 data could be split into eight 
cells according to the following classifications: (i) with or without 
profits, (ii) medical or non-medical, (iii) whole-life or endowment 
assurance; or again the combined data could be regarded as split 
into cells, one for each age, or one for each quinquennial group of 
ages. The number in each cell is known as the cell frequency and 
χ² is defined as 

$$\chi^2 = \sum \frac{(\text{actual cell frequency} - \text{expected cell frequency})^2}{(\text{standard error of cell frequency})^2},$$

the summation extending over all the cells. 

The cell frequencies are rarely independent of each other: in 
particular the method of graduation involved often ensures for a 
mortality table that the total number of actual deaths and the total 
number of expected deaths shall be equal. Any relationship between 
cell frequencies is known as a constraint or, if the relationship is of 
the first degree, as a linear constraint. 

If there are n cells and k linear constraints the function f = n − k 
is called the number of degrees of freedom. 

Suppose that we took a great many samples of the same size with 
the same values of n and k, and that we calculated χ² for each. The 
resulting values could be grouped into a frequency distribution, 
namely the sampling distribution of χ². It should be noted that in 
applying the χ² test the whole of the data are regarded as one 
sample, whereas previously we have regarded the data at each age 
as a sample. 

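It can be shown that the sampling distribution of χ² follows 
the curve 

$$y = y_0\,(\chi^2)^{\frac{f-2}{2}}\, e^{-\frac{1}{2}\chi^2}, \tag{1}$$

where f is the number of degrees of freedom. 

The range of χ² is 0 to ∞, so that the total area is 

$$y_0 \int_0^\infty (\chi^2)^{\frac{f-2}{2}}\, e^{-\frac{1}{2}\chi^2}\, d(\chi^2).$$

Substituting ½χ² = t, this becomes 

$$y_0\, 2^{f/2} \int_0^\infty t^{\frac{f}{2}-1}\, e^{-t}\, dt = y_0\, 2^{f/2}\, \Gamma\!\left(\tfrac{f}{2}\right).$$

(Mathematics for Actuarial Students, Part I, p. 148.) 

To make the total area unity we put 

$$y_0 = \frac{1}{2^{f/2}\, \Gamma(f/2)}$$

and write the equation to the curve of χ²: 

$$y = \frac{1}{2^{f/2}\, \Gamma(f/2)}\, x^{\frac{f-2}{2}}\, e^{-\frac{1}{2}x},$$

where x represents χ². 

[Figure: frequency curve of χ², with the area to the right of the ordinate LL′ shaded.] 
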
The above figure shows roughly the form of the frequency curve 
if f > 2. It rises to a maximum where x (i.e. χ²) = f − 2 and falls 
away more gradually to the right as x increases. If f is large the curve 
is roughly symmetrical and it can be shown that √(2χ²) follows 
approximately a normal curve with mean √(2f − 1) and unit standard 
deviation. Consequently for large values of f the tables prepared 
for the normal curve can be used if a new variable x′ is taken, so that 

$$x' = \sqrt{2\chi^2} - \sqrt{2f-1} = \sqrt{2x} - \sqrt{2f-1}.$$

The values given in standard tables for degrees of freedom 40 and 
over have usually been obtained from the tables for the normal curve 
by means of this approximate method. 

If OL = x₀ the shaded area to the right of the ordinate LL′ is 
equal to 

$$P = \frac{1}{2^{f/2}\, \Gamma(f/2)} \int_{x_0}^{\infty} x^{\frac{f-2}{2}}\, e^{-\frac{1}{2}x}\, dx. \tag{2}$$


This expression for P, although somewhat involved, can be 
evaluated and extensive sets of tables have been prepared. They are 
more complicated than those based on the normal curve, since each 
value of f has to be treated separately. 
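
For the reader with a computer at hand, the probability P can also be evaluated directly instead of being read from tables. The following is only a sketch and not the method by which the published tables were prepared: it assumes that f is a whole number and uses the well-known finite series for the upper tail of the χ² distribution (one form for even f, another for odd f). The function name `chi2_upper_tail` is an invention for this illustration.

```python
import math


def chi2_upper_tail(x0: float, f: int) -> float:
    """Probability that chi-square on f degrees of freedom exceeds x0 (integer f >= 1)."""
    if f < 1:
        raise ValueError("f must be a positive integer")
    if x0 <= 0:
        return 1.0
    half = x0 / 2.0
    if f % 2 == 0:
        # Even f: P = exp(-x0/2) * sum_{i=0}^{f/2 - 1} (x0/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, f // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd f: P = erfc(sqrt(x0/2)) + 2 z(c) * sum_{r=1}^{(f-1)/2} c^(2r-1) / (1*3*...*(2r-1)),
    # where c = sqrt(x0) and z is the ordinate of the unit normal curve at c.
    c = math.sqrt(x0)
    z = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = c, 0.0
    for r in range(1, (f - 1) // 2 + 1):
        if r > 1:
            term *= x0 / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2.0 * z * total


# The value 14.086 on three degrees of freedom is the one met in Example 2 of this chapter.
print(chi2_upper_tail(14.086, 3))
```

A value of P near ½ is what random sampling would be expected to give; values very near 0 or very near 1 call for the interpretations discussed in the text.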

There are two main sets of tables: 

(1) Tables giving the value of P for certain values of χ². In 
Tables for Statisticians and Biometricians, Part I, these tables 
are given for each value of f from 2 to 29 and for higher values 
of f at somewhat greater intervals. 

(2) Tables giving the values of χ² (i.e. x₀) for certain useful values of P. 
Table II in the Appendix is typical. It is reproduced from 
H. L. Seal’s paper, “Tests of a Mortality Table Graduation”, 
by kind permission of the author. 

Since the total area is unity, P is clearly the probability that a value 
of x (i.e. a value of χ²) obtained from a random sample will be greater 
than x₀. Having calculated χ² for a sample, we can use tables in the 
first form to obtain the probability that a random sample would 
give rise to a value of χ² as great as or greater than the one obtained. 

Tables in the second form start with probability levels (see 
p. 101) and enable the values of χ² corresponding to them to be 
obtained. 

Thus from Table II in the Appendix we find that for a sample 
with 40 degrees of freedom the value of χ² corresponding to the 
probability level .01 is 62.88. If in testing a graduation we grouped 
the data so as to give 40 degrees of freedom and obtained a value of 
χ² of, say, 65 we should infer that the data had been distorted because 
the chance of so high a value as 65 arising from sampling errors is 
less than .01. If on the other hand χ² lay between, say, 28 and 50 
we should be satisfied with the graduation as regards fidelity to data 
since the probability P would be more reasonable, i.e. nearer ½. 
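
The tabular value 62.88 quoted above can be recovered from the normal approximation described earlier in this section, under which √(2χ²) − √(2f − 1) is treated as a unit normal deviate. The short sketch below assumes nothing beyond that approximation; the function name `approx_chi2_point` is hypothetical and the normal deviate is taken from Python's standard library rather than from printed tables.

```python
from math import sqrt
from statistics import NormalDist


def approx_chi2_point(p: float, f: int) -> float:
    """Approximate value of chi-square exceeded with probability p, for large f."""
    # Unit normal deviate exceeded with probability p (about 2.326 for p = .01).
    x_prime = NormalDist().inv_cdf(1.0 - p)
    # Invert x' = sqrt(2 * chi2) - sqrt(2f - 1) for chi2.
    return (x_prime + sqrt(2 * f - 1)) ** 2 / 2.0


print(round(approx_chi2_point(0.01, 40), 2))  # 62.88
```

Being an approximation, the figure differs a little from the exact value of χ² at the .01 level, which is somewhat higher; the difference is of no practical consequence in testing a graduation.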

18. Alternative hypotheses as to the graduated rates. 

In applying the χ² test it is important to distinguish between a 
hypothesis being tested by the data of a sample but based on other 
observations, and a hypothesis based to some extent on the sample 
by which it is being tested. 
For instance, we might collect data showing the degrees of immunization 
against colds produced by a certain treatment for men 
of different age-groups in England. If the resulting figures, after 
any necessary graduation, were applied to a sample of Scotsmen we 
could compare the actual and expected results and calculate χ². 
If the figures were applied instead to the data on which they were 
based we should naturally expect a better fit and a smaller value 
of χ². It is for this reason that, in such cases, a deduction is made 
from the number of cells in finding the degrees of freedom. 

A similar point arises in testing a mortality table graduation. 
We might assume that the graduated rates were the universe rates 
and would remain the same if we tested thousands of samples of 
the same size as the given data. We can imagine a value of χ² to be 
found from each sample and the results tabulated. 

A second hypothesis would be that a separate curve was fitted to 
each of the imaginary samples before the actual and expected results 
were compared and the values of χ² calculated. On this assumption 
a much better agreement would be obtained and the values of χ² 
should be much smaller, corresponding to what we should expect 
with fewer degrees of freedom. 

In actual practice we have only one sample and one set of graduated 
rates. Which hypothesis are we to adopt? Are we to assume that 
the expected deaths are fixed unalterable values to be compared 
with the actual deaths in a great many samples of similar constitution, 
or are we to regard them as applying only to the data on 
which they are based? 

The practice in the past has been to adopt the first hypothesis, 
but modern statisticians, led by Prof. R. A. Fisher, favour the 
second and make an appropriate deduction in finding the number 
of degrees of freedom. 

The point will be referred to later. 

19. Example 1. 

The data of Table VI may be set out as on p. 134, after grouping 
where necessary to ensure an expected class frequency of about 10 as 
the minimum. 

The method of graduation did not impose any constraints, so that the 
number of degrees of freedom is the same as the number of cells, i.e. 14, 
although the approximate equalising of the totals of actual and expected 
deaths suggests that one degree of freedom should be deducted. 
Reference to tables shows that the probability of obtaining a value of 
χ² greater than 2.45 is .99. Hence the low value obtained for χ² is less 
than we should expect from sampling errors, thus confirming our previous 
conclusion that the table was undergraduated. 

Example 2. 

The following example is taken from Seal’s paper in J.I.A. Vol. LXXI 
supra and is based on data for ages 46½ and 51½ included in the A1924-29 
mortality investigation. 

Age 51½ 

Class of policy              E          θ        Eq       θ − Eq    (θ − Eq)²/Epq 

Whole life 
  With profits            41,444½      404     347.86      56.14       9.137 
  Without profits          9,814½       87      82.38       4.62        .261 
Endowment assurance 
  With profits           137,610½     1124    1155.03     −31.03        .841 
  Without profits         27,607½      202     231.73     −29.73       3.847 

Total                    216,476½     1817    1817.00        —        14.086 


Working on the hypothesis that the differences in the mortality in the 
various classes may be due merely to sampling errors, the rates of mortality 
are first calculated as follows: 

$$q_{46\frac{1}{2}} = \frac{\text{Total deaths}}{\text{Total exposed}} = \frac{1435}{248{,}776\tfrac{1}{2}} = .0057682,$$

$$q_{51\frac{1}{2}} = \frac{1817}{216{,}476\tfrac{1}{2}} = .0083935.$$

The expected frequencies in each cell (Eq) are then calculated and χ² 
is found in the usual way, the results being 6.768 and 14.086 respectively. 
Owing to the way q was found the total expected deaths equal the total 
actual deaths, so that a linear constraint is introduced and the number of 
degrees of freedom is not four (the number of cells) but three. Entering 
the table for three degrees of freedom we find that the probability of 
obtaining a value of χ² as great as 6.768 or greater is about .05, while for 
χ² = 14.086 the probability is about .003. 

For age 46½ the probability .05 is small but not really significant. 
The value of .003 for age 51½ is, however, so small that the value of χ² 
obtained cannot be explained as being due to sampling errors. It seems 
therefore that the four classes of policy differ to a significant extent. Why 
this result was not shown at age 46½ is difficult to explain. This question 
is discussed in the original paper, to which reference should be made. 
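
The arithmetic of Example 2 for age 51½ can be retraced as in the following sketch. It rests on two assumptions that should be stated plainly: the fractional part of each exposed to risk figure is read as one half, and the standard error in each cell is taken as √(Epq), as in the table. The variable names are chosen for this illustration only.

```python
# Age 51½ cells: (class of policy, exposed to risk E, actual deaths θ).
# Assumption: the fractional part of each exposed-to-risk figure is one half.
cells = [
    ("Whole life, with profits", 41444.5, 404),
    ("Whole life, without profits", 9814.5, 87),
    ("Endowment assurance, with profits", 137610.5, 1124),
    ("Endowment assurance, without profits", 27607.5, 202),
]
total_exposed = 216476.5  # total exposed to risk as printed in the text
total_deaths = sum(theta for _, _, theta in cells)

# Common rate of mortality on the hypothesis that the classes do not differ.
q = total_deaths / total_exposed
p = 1.0 - q

chi2 = 0.0
for _, exposed, theta in cells:
    expected = exposed * q
    chi2 += (theta - expected) ** 2 / (expected * p)

# One linear constraint: total expected deaths are forced to equal total actual deaths.
degrees_of_freedom = len(cells) - 1

print(f"q = {q:.7f}, chi-square = {chi2:.3f}, f = {degrees_of_freedom}")
```

Because the sketch carries more decimal places than the printed working, the value of χ² it gives may differ from 14.086 in the last figure or two; the conclusion is unaffected.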


20. For convenience we have considered rates of mortality only. 
The tests described apply to other functions of the same type, e.g. 
proportionate frequencies or probabilities, and can easily be adapted 
for any functions which can be expressed approximately in probability 
form. 
In considering other functions, such as sickness rates, we can 
proceed in the normal way as far as obtaining the deviations and 
accumulated deviations and examining the results for changes of 
sign. Statistical tests for measuring the size of the deviations are, 
however, beyond the scope of this book. 

21. Limitation of statistical tests for adherence to data. 

The tests described for estimating the goodness or otherwise of 
the graduation of a proportionate frequency should not be applied 
too automatically nor should the results be interpreted too rigidly. 

The use of 2σ or 3σ as a significance test is based on the assumption 
that the sampling distribution is normal. This does not rest 
upon rigid theory when applied to a proportionate class frequency 
for which the deviations usually follow a skew curve representing 
the binomial expansion (p + q)ⁿ. The error involved should not be 
great if nq is large. 

The use of .8σ as a measure of the mean deviation is open to 
the same objection. Moreover, although with this approximate 
measure we can compare the total deviations irrespective of 
sign, we have no satisfactory criterion by which to interpret the 
difference. For instance, if the total irrespective of sign is 54.6, 
while .8Σ√(npq) is 46.8, we cannot readily decide whether the 
difference of 7.8 is too great to be considered as attributable to 
sampling errors. Although an Italian actuary discovered the 
sampling distribution of the mean deviation in 1937 it is too 
complicated to be considered here. 
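
The factor of four-fifths may be seen to arise as follows, on the assumption (already noted as approximate) that a deviation is normally distributed about zero with standard deviation σ. This short derivation is added as an aid to the reader and uses only the standard form of the normal curve: 

$$E\,|X| = \int_{-\infty}^{\infty} |x|\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx = \frac{2}{\sigma\sqrt{2\pi}} \int_0^\infty x\, e^{-x^2/2\sigma^2}\, dx = \sigma \sqrt{\frac{2}{\pi}} \approx .798\,\sigma .$$

Taking σ = √(npq) at each age and adding, the expected total of the deviations irrespective of sign is therefore about .8Σ√(npq). In the illustration above the figure 46.8 corresponds to Σ√(npq) = 46.8/.8 = 58.5. 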

22. Comparison of two or more graduations. 

A method sometimes used for comparing the results of two or 
more graduations is to calculate for each graduation the sum of the 
squares of the deviations, a small sum indicating good adherence 
to data. 

This test is not very satisfactory. A comparison of the values of 
χ², which takes into account the standard error at each age, is to be 
preferred, and this method will probably be used more extensively 
in future. 
It should be remembered, however, that the graduation with the 
smallest χ² is not necessarily the best from a practical point of view. 
A simple graduation, easy to apply and having practical advantages 
(e.g. a Makeham graduation if joint-life functions are required), 
will usually be the best provided the value of χ² produced is satisfactory 
and the smoothness adequate. 

EXAMPLES 5 

I. The rates of mortality observed among a body of lives have been 
graduated in three ways. The following table shows in quinary age-groups 
the actual numbers of deaths experienced and the expected numbers 
according to each graduation: 


Age-group    Actual            Expected deaths by 
             deaths    Graduation A   Graduation B   Graduation C 

20-24           5           10                            14 
25-29          38           21                            26 
30-34          39           41                            40 
35-39          54           64                            58 
40-44          84           85                            81 
45-49          93          107            102             99 
50-54          97           90             93             93 
55-59          80           75             78             78 
60-64          55           52                            54 

Total         545          545                           543 


Examine the graduations with regard to their fidelity to the experience 
(ignoring smoothness) and comment on their relative merits. 

2. Apply the χ² test to the above data, assuming that the agreement 
between the actual deaths and those expected according to Graduation A 
is not fortuitous but that otherwise no constraints have been imposed. 

3. A small Life Office has examined its mortality experience over a 
recent period of time. The total actual deaths number 471, whilst the 
total expected according to the A1924-29 Table was 450. 

In respect of the Without Profit business the actual deaths were 31 
and the expected deaths 50. 

It is accordingly suggested that the Without Profit business must 
attract a much better class of life than the With Profit business. 

Criticize this suggestion and state carefully the assumptions underlying 
any tests which you might consider it necessary to make. 

4. What is the justification for graduation of mortality statistics? 

Apply the usual tests to the graduated rates of mortality given in the 
following table, which represents a section of a mortality experience. 
What special features do the graduated rates show? 

Age x    Exposed to    Deaths θₓ    10⁵ × qₓ           Graduated 
         risk Eₓ                    = 10⁵ × θₓ/Eₓ      10⁵ qₓ 

 50        2000           10            500               616 
 51        2000           14            700               658 
 52        1900            6            316               704 
 53        1800           17            944               753 
 54        1700            9            529               805 
 55        1400            7            500               861 
 56        1300            5            385               921 
 57        1300           12            923               985 
 58        1300           18           1385              1055 
 59        1250           24           1920              1138 
 60        1050            8            762              1241 
 61         950           15           1579              1371 
 62         950           12           1263              1534 
 63         950           20           2105              1736 
 64         900           15           1667              1980 
 65         850           18           2118              2267 
 66         800           27           3375              2597 
 67         800           26           3250              2965 
 68         750           34           4533              3361 
 69         750           24           3200              3774 
 70         700           35           5000              4196 
 71         700           25           3571              4623 
 72         650           37           5692              5065 
 73         650           25           3846              5538 
 74         600           41           6833              6058 
 75         550           29           5273              6628 


5. Criticize the following graduation from the point of view of fidelity 
to the data: 


Age-group 

40-44       15,518        65         73.9 
45-49       19,428       144        134.6 
50-54       21,594       219        223.9 
55-59       21,890       378        346.3 
60-64       19,174       465        468.1 
65-69       15,775       557        600.2 
70-74       11,414       685        675.5 
75-79        6,993       644        637.4 
80-84        3,276       471        458.7 
85-89        1,096       217        240.6 
90-94          201        67         61.4 
6. Criticize the following section of a graduation from the point of 
view of smoothness and test the adherence to the data: 



7. The following table shows the results of another attempt to graduate 
the data shown in Example 6. Calculate, for both sets of graduated values, 
any functions which enable (a) the smoothness and (b) the adherence to 
data of the two graduations to be compared. 
8. The following table shows part of the Manchester Unity sickness 
experience 1893-97 for Occupation Groups A, H, J. 

Column (1): No. of years of life exposed to risk of sickness. 
Column (2): No. of sickness claims. First two years of sickness. 
Column (3): Graduated proportion of members sick. First two years of sickness. 
Column (4): No. of weeks of sickness claim. First three months of sickness. 
Column (5): Graduated weeks of sickness per member per annum. First three months of sickness. 

Age       (1)          (2)        (3)        (4)        (5) 

 30     70,903       15,051      .213      45,742      .649 
 31     69,389.5     14,866      .213      46,185      .656 
 32     67,639.5     14,507      .214      45,034      .663 
 33     66,397.5     14,212      .214      44,661      .670 
 34     64,043.5     13,629      .213      42,848      .677 
 35     61,632.5     13,108      .213      42,432      .685 
 36     59,473       12,701      .214      41,436      .696 
 37     57,380.5     12,406      .215      40,518      .709 
 38     54,741.5     11,777      .217      39,411      .726 
 39     52,911       11,642      .220      39,767      .744 
 40     51,478       11,434      .222      39,323      .764 
 41     49,835.5     11,378      .226      39,613      .784 
 42     48,199       10,929      .227      38,737      .803 
 43     46,818.5     10,737      .230      38,413      .821 
 44     45,418       10,563      .232      38,105      .839 
 45     43,483       10,168      .235      37,115      .858 
 46     41,654.5      9,828      .237      36,675      .879 
 47     40,328.5      9,749      .241      36,521      .904 
 48     39,107        9,481      .244      35,735      .932 
 49     37,723.5      9,441      .249      36,304      .962 
 50     36,510        9,194      .252      37,251      .994 
 51     35,237.5      9,082      .257      35,873     1.027 
 52     33,876.5      8,827      .260      36,023     1.061 
 53     32,727.5      8,673      .266      36,218     1.096 
 54     31,190.5      8,362      .271      34,181     1.134 
 55     29,664.5      8,327      .278      35,306     1.175 
 56     27,969.5      7,949      .284      34,385     1.218 
 57     26,461.5      7,726      .292      33,306     1.262 
 58     24,377        7,314      .300      32,169 
 59     22,872        6,985      .308      30,178     1.356 
 60     21,318        6,756      .315      30,175     1.409 
The proportion of members sick, first two years of sickness, is obtained 
by dividing the figures in column (2) by the corresponding figures in 
column (1). Column (3) shows the results of graduating the proportions 
thus obtained. 

The number of weeks of sickness per member per annum, first three 
months of sickness, is obtained by dividing the figures in column (4) by 
the corresponding figures in column (1). Column (5) shows the results 
of graduating the results. 

Criticize the two graduations and point out why tests applicable to 
one graduation are not applicable to the other. 



CHAPTER VI 


THE GRAPHIC METHOD 

1. The student of physics will be familiar with plotting on squared 
paper the results of experiments and subsequently drawing smooth 
curves through or near to the points in an effort to arrive at some 
underlying law connecting the variables. 

The graphic method of graduation as used by actuaries is merely 
a more refined version of this process. It can conveniently be 
considered in three main stages. 

1. The rough data are grouped. 

2. The grouped data are represented graphically in some form 
or another and a smooth curve is drawn reproducing the 
general trend of the data, but not adhering too closely to local 
fluctuations. 

3. Values are read off from the smooth curve and subsequently 
adjusted and readjusted until they satisfy the two requirements 
of smoothness and adherence to data. This last process 
is usually referred to as hand-polishing. 

There are two main ways in which the method can be used: it 
can be applied to the exposed to risk and decrements separately or 
the crude rates of decrement can be found from the data and 
subsequently graduated. 

2. Separate graduation of exposed to risk and decrements. 

As broad groupings may be necessary an obviously satisfactory 
method of representing the data graphically is by a histogram. It 
should be remembered that the rectangles of the histogram must have 
bases proportionate to the ranges of the separate groups, which will 
not always be equal. It is rather difficult to decide on the preliminary 
grouping, but as a general guide it may be said that sufficiently wide 
or “coarse” groups should be adopted to ensure a fairly regular 
outline for the histogram. The drawing of the smooth curve is a 
very difficult matter and unless the grouping is satisfactory it may 
become almost impossible. 

In drawing the smooth curve care should be taken that in each of 
the vertical strips of the original diagram the area of the rectangle 
of the histogram is roughly equal to the area bounded by the curve 
and the ordinates forming the sides of the rectangle. The areas 
should not be exactly equal, as the process would then be merely a 
graphical interpolation and not a graduation at all. We know in 
fact from the previous chapters that the graduated values should 
differ from the ungraduated by amounts equal to reasonable 
sampling errors. The difficulty, therefore, of ensuring that the 
smooth curve departs sufficiently but not too much from the data 
as represented by the histogram is one of the chief drawbacks of 
this method. 

In reading off the graduated values from the smooth curve thus 
drawn it is desirable to tabulate them either for every age or, if this 
is impossible, for small groups of ages which do not correspond with 
the groupings adopted for the histogram. This is necessary because 
in testing for adherence to data the histogram groupings must not 
be used. The way in which the curve was drawn ensures reasonable 
agreement within those groups and we need to know whether the 
agreement is satisfactory over all ranges. 

The tests for smoothness of the graduated rates call for no special 
comment, and the student should remember to make an inspection 
of the deviations and accumulated deviations, preferably for 
individual ages. 

As mentioned in the previous chapter it is desirable to recalculate 
the decrements after the exposed to risk figures have been 
graduated so as to retain the same rates of decrement as before. 
The adjusted decrements are then graduated in the same way. 

The following example will serve to illustrate the method: 
Example 1. 

Graduate by a graphic process the deaths and marriages in the following 
table: 

Recorded age x      Decrements between x and next 
   (exact)          recorded age shown in the first column 

                        Deaths        Marriages 

     16                   106             979 
     20                    50            1281 
     22                    64            2069 
     25                    81            2033 
     30                    68             756 
     35                    —               — 
It will be assumed that the exposed to risk figures have already been 
graduated and the decrements re-calculated as mentioned above. As the 
data are so scanty no further grouping seems possible or desirable. It 
should be noted that the ranges of the groups are unequal. 



In drawing the histogram the heights of the rectangles are obtained 
as follows: 

No. of deaths between ages 16 and 20 = 106 = area of rectangle. 

Base of rectangle (range of group) = 4. 

Height of rectangle = 106/4 = 26.5. 

Similarly, height of second rectangle = 50/2 = 25, and so on. 
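
The same calculation for every group, and for the marriages as well as the deaths, may be set out as in the following sketch. The figures are those of the table in Example 1; the function name `histogram_heights` is merely illustrative. 

```python
# Recorded ages (group boundaries) and decrements between consecutive recorded ages.
ages = [16, 20, 22, 25, 30, 35]
deaths = [106, 50, 64, 81, 68]
marriages = [979, 1281, 2069, 2033, 756]


def histogram_heights(boundaries, decrements):
    """Height of each rectangle = area (decrements in group) / base (range of group)."""
    return [
        area / (upper - lower)
        for area, lower, upper in zip(decrements, boundaries, boundaries[1:])
    ]


death_heights = histogram_heights(ages, deaths)
marriage_heights = histogram_heights(ages, marriages)

for lower, upper, h_d, h_m in zip(ages, ages[1:], death_heights, marriage_heights):
    print(f"{lower}-{upper}: deaths {h_d:.2f}, marriages {h_m:.2f}")
```

Dividing by the range of each group is what allows groups of unequal width to be shown on the same diagram without distorting the areas. 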

The following values of the deaths at individual ages can then be 
derived from the curve (i.e. the areas of unit strips). It is impossible to 
apply a satisfactory test for adherence to data in view of the way in which 
the figures are given. 


Age x                               16     17     18     19     20     21     22 
Deaths between age x and x+1       26.7   26.5   26.3   26.0   25.5   24.7   23.5 

Age x                               23     24     25     26     27     28     29 
Deaths between age x and x+1       21.8   19.8   18.2   16.9   15.9   15.3   14.8 

Age x                               30     31     32     33     34 
Deaths between age x and x+1       14.4   14.0   13.6   13.3   13.0 



A rough comparison in the given groups is as follows: 


Ages       Actual    Graduated 

16-20        106       105.5 
20-22         50        50.2 
22-25         64        65.1 
25-30         81        81.1 
30-35         68        68.3 

Total        369       370.2 


The marriages can be dealt with similarly. In this case it will be 
noticed that the histogram has quite a different shape, thus: 


The values derived from the curve are as follows: 


Age x                                  16     17 
Marriages between age x and x+1       140    165 

Age x                                  23     24 
Marriages between age x and x+1       700    675 
It is almost impossible to test for smoothness and the comparison in 
the original groups is not a satisfactory test of adherence to data. The 
figures are: 

Ages       Actual    Graduated 

16-20        979        985 
20-22       1281       1260 
22-25       2069       2065 
25-30       2033       2035 
30-35        756        755 

Total       7118       7100 


Because of the practical difficulties of replacing a histogram by a 
smooth curve it is often easier to deal with an ogive curve and represent 
the data by points. 

Thus the data in the above example might be written in the form: 


Age x      Decrements occurring below age x 

              Deaths        Marriages 

 16             —               — 
 20            106             979 
 22            156            2260 
 25            220            4329 
 30            301            6362 
 35            369            7118 


In this form the data can be represented by points and graduated by 
the method explained in the next few paragraphs. The objection to the 
use of an ogive in this way is that the numbers tend to mount up very 
rapidly; this renders it difficult to find a suitable scale. The graduated 
values of the decrements are found by differencing. 
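
The passage from the grouped decrements to the ogive, and back again by differencing, is shown in the following sketch. It uses the ungraduated figures of the example, so that differencing simply restores the original groups; with graduated ogive values read from a curve the same differencing would give the graduated decrements. The function names are illustrative only. 

```python
from itertools import accumulate

ages = [16, 20, 22, 25, 30, 35]
deaths = [106, 50, 64, 81, 68]
marriages = [979, 1281, 2069, 2033, 756]


def ogive(decrements):
    """Decrements occurring below each recorded age (zero below the first age)."""
    return [0] + list(accumulate(decrements))


def difference(cumulative):
    """First differences of an ogive, i.e. the decrements in each group."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


deaths_below = ogive(deaths)
marriages_below = ogive(marriages)

for age, d, m in zip(ages, deaths_below, marriages_below):
    print(age, d, m)
```

The rapid growth of the marriages column illustrates the difficulty of scale mentioned above. 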

3. The Carlisle Table. 

Before we leave the histogram method of graduating by the 
graphic process it is appropriate to mention some of the principal 
features of the Carlisle Table. This was the first standard table 
constructed on sound lines. 

Previously the Northampton Table had been constructed by a 
Dr Price, who investigated the registers of the four parishes of 
Northampton. He concentrated on the parish of All Saints, where 
the records were most complete, and ignoring the exposed to risk 
based a table on the recorded deaths during the years 1735 to 1780. 
In effect, as he thought that the population had been sufficiently 
stable for a long time, he took the average deaths as the d_x column 
of a life table. The mortality, especially at the younger ages, was 
overstated, but the table was used for many years for calculating 
rates of annuity granted by the National Debt Office. 

Joshua Milne based his table on statistics for the town of Carlisle. 
His data related to the parishes of St Mary and St Cuthbert and 
consisted of an enumeration of the population in January 1780 and 
December 1787 and particulars of the deaths in the years 1779 to 
1787 inclusive. 

The exposed to risk and deaths were graduated by the method 
described in the previous paragraph and the resulting table was a 
great improvement on any previously available. The sexes were not 
dealt with separately and the large female population included 
caused the mortality to be unusually light. As mortality has steadily 
improved since Milne’s time this light mortality of the Carlisle 
Table prevented it from becoming out of date so soon as it would 
otherwise have done. 

The extensive sets of joint-life tables published by Milne are 
sometimes used for unusual rates of interest, although some adjustment 
of the ages is necessary. 

4. Rates of decrement graduated graphically. 

When the rates of decrement are calculated from the rough data 
they can be represented graphically by points and the graduating 
curve merely has to pass near to these points. Such a curve is consequently 
much easier to draw than a curve which seeks to keep 
areas substantially unaltered. Moreover, a quotient such as a rate 
of decrement tends to progress more smoothly than a function such 
as the exposed to risk. In the graduation of rates of mortality, other 
standard tables may be available giving an indication of the general 
trend of q_x. This is particularly useful at the ends of the table, where 
the data are bound to be so scanty that the rates brought out are 
unreliable, and the curve has to be sketched in the light of previous 
knowledge of other tables. 
5. Preliminary grouping. 

This is usually a difficult matter and calls for considerable 
experience and skill. As will be seen in Example 2 different groupings 
may result in widely differing results. The following remarks 
are, however, of general application. Some authorities, such as 
G. F. Hardy, maintain that grouping with a fixed class-interval 
(e.g. quinquennial groups) gives the best results. The general 
opinion seems to be however that it is preferable to select groups 
in such a way that the group rates progress smoothly. This almost 
invariably involves the use of groups of unequal range; these can 
be found approximately as follows. 

(i) Plot the rough rates of decrement, without grouping, as a 
series of points and sketch in lightly a smooth curve representing 
the general run of the values. This may prove difficult, but refinement 
is unnecessary as a moderate degree of smoothness is quite 
sufficient. 

(ii) Now choose groupings in such a way that points above this 
guide curve are balanced by points below it; i.e. so that a point some 
distance above may be offset by grouping it with the next two, or 
even three, points lying slightly below the curve. The aim should be 
to ensure that the group rate will lie close to the general run of the 
values. Generally speaking, if the curvature seems to be changing 
rapidly, groups should be of short range; if the curve rises to a 
peak and falls again care should be taken to ensure that this peak is 
not cut off by the method of grouping adopted. When the ranges 
have been decided upon the guide curve should be erased. 

6. Plotting the data and sketching the curve. 

For each group the total deaths (or decrements) are divided by 
the total exposed to risk and the resulting rate is plotted as corre¬ 
sponding to the middle age of the range. As we have seen in Chapter 
I, para. 12, this is not strictly accurate, and formula (18) of that 
chapter should be used to find the deaths and exposed to risk 
corresponding to the central age of each group. 

On the other hand, such a refinement is not justifiable unless
the fourth and higher differences are negligible, and the slight
systematic distortion involved when it is omitted is automatically
corrected in the process of hand-polishing.
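The calculation of group rates described above can be sketched as follows. This is only an illustration: the function name `group_rates` is hypothetical, the figures are the entries of Table VII for ages 19 to 33, and the groups are the first three of Table VIII. The unweighted middle age is used as the plotting position, i.e. the refinement of Chapter I is deliberately ignored.

```python
def group_rates(exposed, deaths, groups):
    """Return (first_age, last_age, middle_age, exposed, deaths, rate) per group.

    exposed, deaths: dicts keyed by age; groups: list of (first_age, last_age).
    """
    rows = []
    for first, last in groups:
        ages = range(first, last + 1)
        total_exposed = sum(exposed[a] for a in ages)
        total_deaths = sum(deaths[a] for a in ages)
        middle_age = (first + last) / 2  # unweighted plotting position
        rows.append((first, last, middle_age, total_exposed,
                     total_deaths, total_deaths / total_exposed))
    return rows


# Ages 19-33 of Table VII (exposed to risk and deaths).
exposed = dict(zip(range(19, 34),
                   [9, 14, 18, 23, 31, 37, 54, 66, 85, 99, 130, 163, 196, 223, 248]))
deaths = dict(zip(range(19, 34),
                  [0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 3, 2, 0, 2, 3]))

rows = group_rates(exposed, deaths, [(19, 26), (27, 29), (30, 33)])
for first, last, middle, e, d, rate in rows:
    print(f"{first}-{last}  middle age {middle}  E={e}  deaths={d}  rate={rate:.4f}")
```

Trying several candidate groupings in this way is cheap, which makes it easier to see how sensitive the group rates are to the choice of ranges.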

The points representing the group rates should progress more 
regularly than those representing the original data and it should 
not be difficult to draw a smooth curve passing close to them. The 
importance of this step should not be underestimated, because, 
although it may be said that the final graduation takes place in 
the hand-polishing, this latter process is at the best of times a 
laborious business and may well become very lengthy unless a good 
curve has produced a set of values which are fairly satisfactory 
before any further adjustment is carried out. 

If it is found that an appreciable section of the table is unsatisfactory
by, say, overstating the mortality, it will probably be
quicker in the long run to re-draw that part of the curve and deduce 
fresh values rather than to attempt to adjust the discrepancies by 
hand-polishing. 

Inexperienced operators usually tend to adhere to the data too 
closely in sketching their curves and, as a result, the rates are 
undergraduated. 

The graduation discussed in Chapter V, Table VI, etc. was 
obtained by the graphic process and it will be remembered that 
there was evidence of undergraduation. 

7. Hand-polishing. 

The rates of decrement should be read from the curve for individual
ages, and the first two or three orders of differences should
be tabulated. Scrutiny of these reveals not only where the smoothness
is unsatisfactory but also where points of inflexion occur. If
the first differences are positive, a positive second difference shows 
that the gradient is increasing, while a negative second difference 
shows that the curve is becoming flatter. Consequently, a change 
in the sign of the second difference indicates a point of inflexion and 
this still applies if all the first differences are negative. The reader 
should satisfy himself as to the truth of this by drawing a few curves. 
Although points of inflexion are not unknown in mortality curves,
particularly between the ages of 15 and 35, they always require
investigation and they can often be eliminated by a bolder drawing
of the curve; this will also improve the graduation generally. If
data are grouped, divided differences can be used for testing for
points of inflexion.
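The differencing described in this paragraph can be sketched as follows. The helper names are hypothetical; the rates are the graduated marriage rates of Table XVI (Example 3), held as whole thousandths so that the differences are exact. A point of inflexion is reported wherever two successive non-zero second differences have opposite signs, and the age attached is that of the leading term of the second difference, which is only a rough indication of position.

```python
def differences(values, order):
    """Return the finite differences of the given order."""
    diffs = list(values)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs


def inflexion_ages(ages, values):
    """Ages at which the second difference changes sign (zeros are skipped)."""
    found = []
    last_sign = 0
    for age, d in zip(ages, differences(values, 2)):
        sign = (d > 0) - (d < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                found.append(age)
            last_sign = sign
    return found


ages = list(range(20, 41))
# Graduated marriage rates of Table XVI, in thousandths.
rates = [1, 2, 5, 10, 16, 24, 35, 50, 62, 71, 78,
         82, 84, 83, 80, 76, 71, 66, 61, 57, 53]

third = differences(rates, 3)
largest = max(range(len(third)), key=lambda i: abs(third[i]))
print("largest third difference:", third[largest], "at age", ages[largest])
print("second difference changes sign at ages:", inflexion_ages(ages, rates))
```

On these figures the largest third difference falls at age 25 and the second difference changes sign twice, which agrees with the discussion in Example 3.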

For testing adherence to data the processes previously discussed 
should be applied and, though grouping is usually necessary because 
of scanty data, care should be taken to avoid the groupings adopted 
in drawing the curve. 

In the light of both these sets of tests, i.e. for smoothness and for 
adherence to data, the rates are adjusted and readjusted until a 
satisfactory table is obtained. If, for instance, there is a run of 
positive deviations at one section greater than could be accounted 
for by sampling errors the curve would appear to be too low and 
might with advantage be re-drawn. In practice the operator tends 
to concentrate in the first place on a good progression of second 
differences and devotes too little attention to the matter of adherence 
to data. 

8. Advantages of the method. 

Although not generally used for standard tables and therefore 
seldom discussed in actuarial literature the graphic method is more 
widely used than any other, with the possible exception of the 
method described in the next chapter. 

It is extremely adaptable and can be used for almost any function. 
Above all, it can give good results when the data are so scanty that 
other methods would be out of the question. This is its supreme 
advantage. 

It is commonly used in connection with pension funds and 
friendly societies for functions such as rates of withdrawal or 
retirement. In these cases each society is a law unto itself and it is 
usually impossible to use a standard table without considerable 
adjustment. 

The graphic method allows great scope for individual judgment, 
based very often on wide experience, and in this connection it should 
be pointed out that the ends of the table, which always cause 
difficulty because of scanty data, can usually be dealt with satisfactorily
by sketching those portions of the curve in the light of
knowledge gained from other tables of a similar type.

There is no reason why the graduation should not be adequate,
since the hand-polishing is assumed to continue until the criteria
for both smoothness and adherence to data have been satisfied.

9. Disadvantages of the method. 

In actual practice the method is much more difficult to apply 
than it would seem to be and demands considerable skill and 
patience from the operator. 

It is unsuitable for standard tables based on extensive data, since 
a very high degree of smoothness is difficult to achieve. It is 
usually impossible to obtain sufficient places of decimals in the 
graduated rates because of the difficulty of reading more than 
three figures from a graph. 

In this connection it should be pointed out that, owing to difficulties
of scale, it is usually necessary in graduating rates of mortality
to draw the curve in two parts, namely, one curve up to age 65 or 
70 and another from age 60 or 65 to the end of life. When this is 
done it is desirable to make the curves overlap so as to ensure 
continuity. 

In Example 2 it will be noticed that the two curves used did not 
overlap but that the comparatively smooth progression of the 
ungraduated rates ensured a smooth junction. Nevertheless it is 
unwise to rely on such an uncertain result. 

By leaving scope for individual judgment the method also leaves 
scope for individual bias and prejudice, and it must be confessed that 
by means of the graphic method equally eminent and experienced 
workers might obtain widely differing results from the same data. 

Although a graphic process can be used to graduate the ultimate
portion of a mortality table it is not very satisfactory for dealing with
select data. These are usually so scanty that it is impossible to form 
an idea of the trend until they have been grouped in quinquennial 
or decennial groups of ages. 

If the group rates are plotted on the same sheet as the curve 
representing the graduated ultimate rates, it is often possible to 
form an idea of how the select rates run into the ultimate rates, 
although the select rates themselves may be largely a matter of 
speculation.

10. Application of the χ² test.

It was pointed out in Chapter VI, para. 18, that two hypotheses
were possible in applying the χ² test. If in testing a graphic graduation
we adopt the first, we need make no deduction in finding the
number of degrees of freedom, which in this event would be the
number of groups.

The second hypothesis involves the fact that the curve was to 
some extent “forced” to fit the rough data; the necessary deduction
that should be made on this account from the total number of groups 
is a difficult problem to decide and there may well be different 
opinions on the subject. A full discussion of this difficulty is outside 
the scope of this book. 

Example 2. Female Government Annuitants Table. 

The classic paper on the graphic method was given by Dr T. B. Sprague
in 1886 (J.I.A. Vol. XXVI, and reproduced in J.S.S. Special Number
on Graduation) and related to the graduation of the mortality of Female
Government Annuitants four years and upwards after purchase.

The following results are taken from this paper. 

The rough data are shown in Table VII. 

As a first step the data up to age 51 were grouped in three ways, as
shown in Tables VIII, IX, and X.

Comparison of the rates of mortality in these three tables illustrates 
very clearly the need for experience and sound judgment in deciding on 
the grouping to be adopted. 

The grouping of Table VIII does not appear to be satisfactory since the 
rates produced are far from smooth. The rates in Tables IX and X show 
that either of the groupings adopted would be suitable. It will be seen, 
however, that the rates in these two tables suggest widely differing types 
of curve. From Table IX it might be inferred that the rates of mortality 
from ages 19 to 49 inclusive are approximately constant; in fact by 
amalgamating the data for the whole of this range this constant rate is 
found to be .0116, which does not differ to a significant extent from the
rates brought out for each of the five groups in the table. 

An examination of the rates in Table X would seem to indicate that
$q_x$ exceeds .0159 at age 19, falls to a minimum at an age in the group
30-35 and thereafter increases steadily.

In order to decide which of the sets of groupings was preferable 
Sprague investigated the data for each of the first four years' duration.
As the features of each duration were very similar it will be sufficient to 
show the data for durations 0-3 combined (Table XI). 



Table VII. Data for Female Government Annuitants,
4 years and over after purchase

Age    Exposed    Deaths    Rate of
       to risk              mortality

19          9        -         -
20         14        -         -
21         18        -         -
22         23        -         -
23         31        -         -
24         37        -         -
25         54        2       .037
26         66        2       .030
27         85        1       .012
28         99        1       .010
29        130        3       .023
30        163        2       .012
31        196        -         -
32        223        2       .009
33        248        3       .012
34        280        3       .011
35        319        5       .016
36        360        5       .014
37        416        3       .007
38        473        6       .013
39        571        2       .004
40        653       13       .020
41        733        9       .012
42        833       12       .014
43        941        9       .010
44       1097       12       .011
45       1252       17       .014
46       1416       21       .015
47       1578       13       .008
48       1761       14       .008
49       1958       26       .013
50       2160       22       .010
51       2387       37       .016
52       2669       43       .016
53       2909       51       .018
54       3330       59       .018
55       3682       57       .015
56       4104       83       .020
57       4473       86       .019
58       4885       90       .018
59       5281      123       .023
60       5644      138       .024

Table VIII

Ages      Exposed    Deaths    Rate of
          to risk              mortality

19-26        252        4       .0159
27-29        314        5       .0159
30-33        830        7       .0084
34, 35       599        8       .0133
36, 37       776        8       .0103
38           473        6       .0127
39, 40      1224       15       .0123
41, 42      1566       21       .0134
43, 44      2038       21       .0103
45-47       4246       51       .0120
48, 49      3719       40       .0108
50, 51      4547       59       .0130

19-51      20584      245


Table IX

Ages      Exposed    Deaths    Rate of
          to risk              mortality

19-33       1396       16       .0115
34-37       1375       16       .0117
38-40       1697       21       .0124
41-44       3604       42       .0117
45-49       7965       91       .0114
50, 51      4547       59       .0130


Table X

Ages      Exposed    Deaths    Rate of
          to risk              mortality

19-29        566        9       .0159
30-35       1429       15       .0105
36-38       1249       14       .0112
39-44       4828       57       .0118
45-51      12512      150       .0120


Table XI

Ages      Exposed     Deaths    Rate of
          to risk               mortality

15-20       102.5        2       .0195
21-27       501.6        6       .0120
28-32       675.3        7       .0104
33-37      1356.0       10       .0074
38-42      2548.3       19       .0075
43-46      3179.9       35       .0110
47-49      3196.0       29       .0091

Total    11,559.6      108         -


The trend of these rates was not materially affected by other groupings 
tried and Sprague accordingly decided that the mortality did decrease up 
to say age 30. On this assumption the best of the three methods of 
grouping the ultimate data was the third, which he therefore adopted. 

For ages 52 to 70 the rates progressed more regularly and the choice 
of groups was much easier. The following groupings were used: 

Table XII

Ages      Exposed to    Deaths    Rate of
          risk                    mortality

52-55       12,590        210      .0167
56-58       13,462        259      .0192
59-61       16,963        377      .0222
62                        157      .0245
63                        182      .0269
64           7,247                 .0288
65, 66      15,462                 .0333
67                                 .0393
68           8,197                 .0431
69, 70      16,679                 .0443

Total      111,845       3315        -


Finally, with certain exceptions, the data over age 70 were used for 
individual ages; the data were amalgamated for the following ages: 
81 and 82; 83 and 84; 88 and 89; 92 and 93; 95-97; and 98-102. 

Because of the difficulty of finding a suitable scale for the whole table
Sprague used two curves. He used one curve for ages up to 70 and a
second curve for ages 71 and upwards. Although, as stated on p. 152, it
is usually desirable to construct two such curves so that there is a slight
overlap, Sprague succeeded in effecting a smooth junction without this
precaution.

The reader should plot the points representing the data, grouped as
above, and should graduate them graphically, including the final
hand-polishing.

In order to enable the student to criticize his own work Sprague’s 
graduated rates are set out in Table XIII. 

It is a useful exercise to analyse these results, testing the smoothness by 
differencing and also testing adherence to data by calculating the deviations,
accumulated deviations, square root of the expected deaths and
finally by the methods described in the previous chapter. 


Table XIII. Sprague's graduated values of $q_x$

Female Government Annuitants. 4 years and upwards after purchase

Age    q_x       Age    q_x       Age    q_x       Age    q_x

19    .0180      40    .0122      61    .0232       82   .1400
20    .0178      41    .0121      62    .0249       83   .1520
21    .0176      42    .0119      63    .0270       84   .1660
22    .0173      43    .0116      64    .0294       85   .1830
23    .0169      44    .0114      65    .0321       86   .2000
24    .0162      45    .0113      66    .0350       87   .2170
25    .0151      46    .0113      67    .0380       88   .2340
26    .0143      47    .0114      68    .0410       89   .2500
27    .0135      48    .0115      69    .0440       90   .2650
28    .0128      49    .0121      70    .0470       91   .2830
29    .0121      50    .0130      71    .0500       92   .3000
30    .0116      51    .0140      72    .0560       93   .3200
31    .0110      52    .0155      73    .0630       94   .3390
32    .0105      53    .0163      74    .0700       95   .3600
33    .0103      54    .0170      75    .0770       96   .3820
34    .0103      55    .0177      76    .0840       97   .4100
35    .0106      56    .0184      77    .0910       98   .4390
36    .0109      57    .0191      78    .0990       99   .4660
37    .0112      58    .0199      79    .1080      100   .5000
38    .0118      59    .0208      80    .1180      101   .5450
39    .0120      60    .0219      81    .1290      102   .6050


Example 3.

Table XIV. Graduation of marriage rates

Age   Exposed      No. of      Rate of      Age    Exposed      No. of      Rate of
      to risk of   marriages   marriage            to risk of   marriages   marriage
      marriage                                     marriage

20      1100          0           -         31        600          45        .0750
21      1100          6         .0055       32        500          40        .0800
22      1100          2         .0018       33        500          44        .0880
23      1100         13         .0118       34        400          35        .0875
24      1000         17         .0170       35        400          30        .0750
25      1000         18         .0180       36        400          25        .0625
26       900         36         .0400       37        300          21        .0700
27       800         34         .0425       38        300          15        .0500
28       700         49         .0700       39        300          21        .0700
29       700         49         .0700       40        300          15        .0500
30       600         48         .0800       Total   14100         563          -


There appears to be no advantage in graduating the exposed to risk 
and marriages separately and we shall accordingly operate on the rate of 
marriage. 

Suitable groupings can be arrived at only after repeated trials, but a 
rough sketch of the curve suggests that the groupings shown in the 
following table should give good results. 

It should be borne in mind that these may not necessarily be the best 
groupings; there may be others which will give equally satisfactory results. 


Age range   Central age   Exposed    No. of       Rate of
            of group      to risk    marriages    marriage

20-22          21           3,300         8        .0024
23-24          23½          2,100        30        .0143
25-26          25½          1,900        54        .0284
27-28          27½          1,500        83        .0553
29             29             700        49        .0700
30-31          30½          1,200        93        .0775
32-33          32½          1,000        84        .0840
34-37          35½          1,500       111        .0740
38-40          39             900        51        .0567

Total           -          14,100       563          -


Sometimes, in a particular group, it may be desirable to use a weighted
mean age instead of the central age if the numbers exposed to risk are
very unevenly distributed. In this example such refinement was unnecessary
and the rates shown in the last column were plotted as corresponding
to the central ages shown in the second column and a curve
was drawn passing near these points. The position of the maximum
ordinate causes difficulty in this particular graduation.



The graduated values read from the curve are shown in the following
table, only three decimal places being recorded (the fourth would be
quite unreliable). The first three orders of differences are shown so that
the smoothness can be examined. It should be noted that there are two
points of inflexion which from the general shape of the curve seem to be
actual features of the experience, to be retained in the final table.

The following table is also needed for testing adherence to data:


Table XVI

Age    Graduated   Exposed    Expected    Actual      Deviations   Accumulated   √(4)
       rate        to risk    marriages   marriages   (5)-(4)      deviations    approx.
(1)    (2)         (3)        (4)         (5)         (6)          (7)           (8)

20     .001        1,100        1.1          0          - 1.1        - 1.1          1
21     .002        1,100        2.2          6          + 3.8        + 2.7          1
22     .005        1,100        5.5          2          - 3.5        -  .8          2
23     .010        1,100       11.0         13          + 2.0        + 1.2          3
24     .016        1,000       16.0         17          + 1.0        + 2.2          4
25     .024        1,000       24.0         18          - 6.0        - 3.8          5
26     .035          900       31.5         36          + 4.5        +  .7          6
27     .050          800       40.0         34          - 6.0        - 5.3          6
28     .062          700       43.4         49          + 5.6        +  .3          7
29     .071          700       49.9         49          -  .9        -  .6          7
30     .078          600       46.8         48          + 1.2        +  .6          7
31     .082          600       49.2         45          - 4.2        - 3.6          7
32     .084          500       42.0         40          - 2.0        - 5.6          6
33     .083          500       41.5         44          + 2.5        - 3.1          6
34     .080          400       32.0         35          + 3.0        -  .1          6
35     .076          400       30.4         30          -  .4        -  .5          5
36     .071          400       28.4         25          - 3.4        - 3.9          5
37     .066          300       19.8         21          + 1.2        - 2.7          4
38     .061          300       18.3         15          - 3.3        - 6.0          4
39     .057          300       17.1         21          + 3.9        - 2.1          4
40     .053          300       15.9         15          -  .9        - 3.0          4

Total    -        14,100      566.0        563          +28.7        + 7.7        100
                                                        -31.7        -42.2


√(expected marriages) has been taken as an approximation to the standard
error of the deviation, but the more accurate expression
√{expected marriages × (1 - graduated rate)}
would not materially affect the results. It is seen that the deviations change
sign 16 times as compared with only 4 continuations of sign; this is not
unsatisfactory, although the accumulated deviations are negative above
age 30.

The individual deviations never exceed twice the standard error except
at age 21, where the number of marriages is so small as to make the test
unreliable. We do not get any runs of deviations with the same sign
and of substantial amount which would need to be tested as a group,
while the net deviation of -3 is satisfactory.
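The tests of adherence applied to Table XVI can be reproduced roughly as follows. This is a sketch only: the function and key names are hypothetical, and the expected marriages are recomputed from the three-figure graduated rates, so individual figures may differ from the tabulated ones in the last place.

```python
from math import sqrt

ages = list(range(20, 41))
# Columns (2), (3) and (5) of Table XVI; rates are held in thousandths.
rates_thousandths = [1, 2, 5, 10, 16, 24, 35, 50, 62, 71, 78,
                     82, 84, 83, 80, 76, 71, 66, 61, 57, 53]
exposed = [1100, 1100, 1100, 1100, 1000, 1000, 900, 800, 700, 700, 600,
           600, 500, 500, 400, 400, 400, 300, 300, 300, 300]
actual = [0, 6, 2, 13, 17, 18, 36, 34, 49, 49, 48,
          45, 40, 44, 35, 30, 25, 21, 15, 21, 15]


def adherence_summary(ages, rates_thousandths, exposed, actual):
    """Deviations, accumulated deviations and the simple tests based on them."""
    expected = [r * e / 1000 for r, e in zip(rates_thousandths, exposed)]
    deviations = [a - x for a, x in zip(actual, expected)]
    accumulated, running = [], 0.0
    for d in deviations:
        running += d
        accumulated.append(running)
    pairs = list(zip(deviations, deviations[1:]))
    return {
        "net": sum(deviations),
        "total_abs": sum(abs(d) for d in deviations),
        "sign_changes": sum(1 for d1, d2 in pairs if d1 * d2 < 0),
        "continuations": sum(1 for d1, d2 in pairs if d1 * d2 > 0),
        # Ages where the deviation exceeds twice sqrt(expected).
        "large": [x for x, d, e in zip(ages, deviations, expected)
                  if abs(d) > 2 * sqrt(e)],
        # The quantity .8 * sum of sqrt(expected), compared with total_abs.
        "comparison": 0.8 * sum(sqrt(e) for e in expected),
        "accumulated": accumulated,
    }


summary = adherence_summary(ages, rates_thousandths, exposed, actual)
for key in ("net", "total_abs", "sign_changes", "continuations", "large", "comparison"):
    print(key, summary[key])
```

The sketch uses unrounded square roots, whereas column (8) of the table rounds each to the nearest integer, so the comparison figure comes out a little above 80.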

The total of the deviations irrespective of sign is 28.7 + 31.7 = 60.4,
while .8 Σ √(expected marriages) is 80. These figures are sufficiently close
to give no indication of under- or over-graduation.

Looking at the table of differences we see that $\Delta^3 (mq)_{25}$ is -.007, which
is large as compared with the other third differences. In altering this it
should be borne in mind that the accumulated deviations above age 30
are all negative, suggesting that the curve is rather too high around ages
25-30.

If we alter $(mq)_{27}$ from .050 to .048 and $(mq)_{28}$ from .062 to .061 the
run of the differences is improved and the largest third difference, namely,
$\Delta^3 (mq)_{26}$, is now only -.003. What is perhaps more important is the
surprisingly large effect which these slight modifications have on the
accumulated deviations.

The revised figures for ages 27 and 28 are as follows: 


Age    Graduated   Exposed    Expected    Actual      Deviations   Accumulated
       rate        to risk    marriages   marriages   (5)-(4)      deviation
(1)    (2)         (3)        (4)         (5)         (6)          (7)

27     .048          800       38.4         34          - 4.4        - 3.7
28     .061          700       42.7         49          + 6.3        + 2.6


Thus the accumulated deviations from age 28 onwards are all 2.3
greater than they were before and several changes of sign are thus
introduced. The net deviation is now only -.7, while the accumulated deviations
add up to +18.2 - 21.2, a result much nearer to zero than before. The
total deviations irrespective of sign become 59.5.
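The effect of the two alterations can be checked with a short calculation. The sketch below is illustrative only (the helper names are hypothetical); it recomputes the third differences and the accumulated deviations before and after altering the rates at ages 27 and 28, working from the three-figure rates of Table XVI.

```python
def differences(values, order):
    """Return the finite differences of the given order."""
    diffs = list(values)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs


def accumulated_deviations(rates_thousandths, exposed, actual):
    """Running total of (actual - expected), age by age."""
    out, running = [], 0.0
    for r, e, a in zip(rates_thousandths, exposed, actual):
        running += a - r * e / 1000
        out.append(running)
    return out


ages = list(range(20, 41))
original = [1, 2, 5, 10, 16, 24, 35, 50, 62, 71, 78,
            82, 84, 83, 80, 76, 71, 66, 61, 57, 53]
exposed = [1100, 1100, 1100, 1100, 1000, 1000, 900, 800, 700, 700, 600,
           600, 500, 500, 400, 400, 400, 300, 300, 300, 300]
actual = [0, 6, 2, 13, 17, 18, 36, 34, 49, 49, 48,
          45, 40, 44, 35, 30, 25, 21, 15, 21, 15]

revised = list(original)
revised[ages.index(27)] = 48   # .050 altered to .048
revised[ages.index(28)] = 61   # .062 altered to .061

third = differences(revised, 3)
worst = max(range(len(third)), key=lambda i: abs(third[i]))
before = accumulated_deviations(original, exposed, actual)
after = accumulated_deviations(revised, exposed, actual)
shift = [b - a for a, b in zip(before, after)]

print("largest third difference:", third[worst], "at age", ages[worst])
print("shift in accumulated deviation at age 28:", round(shift[ages.index(28)], 1))
```

Because the expected marriages fall by 1.6 at age 27 and by .7 at age 28, every accumulated deviation from age 28 onwards rises by the same 2.3.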

The graduation may now be considered satisfactory.


EXAMPLES 6

1. A Life Office which does not grant surrender values until three
complete years' premiums have been paid has obtained the following
data from its experience of withdrawals (lapses and surrenders combined)
under Whole Life Policies.

Graduate the rates of withdrawal graphically, and write down the
graduated values. Comment on any special points arising.

Curtate      Exposed to risk    Withdrawals    Rate of
duration     of withdrawal      (W)            withdrawal
n

 0              11,000             209           .019
 1              10,000             650           .065
 2               9,000            1215           .135
 3               7,000             679           .097
 4                                 352           .064
 5               4,000             196           .049
 6               3,000             102           .034
 7               2,500              92           .037
 8               2,000              43           .022
 9               1,600              56           .035
10               1,200              36           .030
11               1,000              21           .021
...                ...             ...            ...


2. Given the following particulars, find interpolated values for
durations 0 to 26 weeks by the graphic method:

No. of weeks' sickness        No. of cases
since accident

Under 2 weeks                      470
2 but under 3 weeks               1970
3 but under 4 weeks               1510
4 but under 5 weeks                120
5 but under 13 weeks               203
13 but under 26 weeks              159


3. Employ the graphic method of graduation to obtain from the data 
below the adjusted rates of mortality for ages 47-67. Apply all the tests 
you know to the results of your graduation. 


Age   Exposed    Deaths   10^5 q_x     Age   Exposed    Deaths   10^5 q_x
      to risk                                to risk

47      166         2       1205       58      628        11       1752
48      187         2       1070       59      701        14       1997
49      218         4       1835       60      813        18       2214
50      243         6       2469       61      917        18       1963
51      276         2        725       62     1040        24       2308
52      302         4       1325       63     1182        30       2538
53      347         7       2017       64     1299        43       3310
54                  3        769       65     1432        41       2863
55      430         9       2095       66     1596        54       3383
56      494         9       1822       67     1752        64       3653
57                  8       1434






4. A series of values of $q_x$, which were obtained from deaths and
exposed to risk, is to be graduated by a graphic method. It has been

suggested that if the two sets of points given by ± are plotted on 

A* 


the graph, then any smooth curve lying entirely between these two sets
of points will give a satisfactory graduation. Comment on the suggestion. 


5. Graduate the following experience by the graphic process. What 
limitations as regards (a) the graduation, (b) the tests of the graduation,
are imposed by the way in which the data are given ? 


Age-group   Exposed to      Actual     Age-group       Exposed to      Actual
            risk of death   deaths                     risk of death   deaths

30-34            15            -       70-74              4,150          195
35-39            50            -       75-79              3,568          247
40-44           189            2       80-84              2,516          286
45-49           475            8       85-89              1,284          198
50-54         1,020            9       90-94                365          111
55-59         2,032           34       95-99                 56           23
60-64                         54       100 and over           3            2
65-69         4,203          113

                                       Total             23,226         1282


6. Find, by the graphic process, graduated values of $q_x$ from the
following data:

(i) by separate graduation of the exposed to risk and deaths,

(ii) by graduating the group values of $q_x$.


Age-group   Exposed     Actual     Age-group      Exposed     Actual
            to risk     deaths                    to risk     deaths
            of death                              of death

30-34           9          -       65-69             829         40
35-39          22          -       70-74             864         51
40-44          24          -       75-79                         85
45-49          54          -       80-84
50-54         194          9       85-89             217         54
55-59         395         10       90-94              49
60-64         678         27       95 and over         3          2

                                   Total            4622        363


7. From the following data obtain by the graphic method graduated
values of $q_x$ for ages 45 to 60 inclusive, and apply the usual tests to your
graduation:


Age x    E_x    θ_x     q_x        Age x    E_x    θ_x     q_x

21-25     20     1     .0500       46       100     2     .0200
26        10     -       -         47       110     3     .0273
27        20     -       -         48       110     1     .0091
28        20     -       -         49       100     3     .0300
29        30     1     .0333       50       100     1     .0100
30        40     -       -         51       100     1     .0100
31        40     2     .0500       52       100     -       -
32        40     -       -         53       100     1     .0100
33        50     -       -         54       100     1     .0100
34        50     2     .0400       55       100     2     .0200
35        40     -       -         56       100     3     .0300
36        60     -       -         57       100     2     .0200
37        60     -       -         58       110     3     .0273
38        70     -       -         59       110     -       -
39        70     1     .0143       60       110     3     .0273
40        80     -       -         61       110     3     .0273
41        80     1     .0125       62       110     2     .0182
42        90     2     .0222       63       110     2     .0182
43        90     2     .0222       64       100     4     .0400
44        90     -       -         65       100     3     .0300
45       100     1     .0100       66-70    500    20     .0400




CHAPTER VII 


GRADUATION BY REFERENCE 
TO A STANDARD TABLE 

1. In this chapter we shall deal only with rates of mortality, but if 
a suitable standard table can be found for other rates of decrement 
(e.g. withdrawals or marriages) the method can be applied quite 
satisfactorily. It is rarely, however, that such a standard table exists. 

All mortality tables commence with a portion where $dy/dx$, the
slope of the curve, is small and finish with a portion where it is
steep. A method which may be satisfactory for one part of the 
curve may be quite unsuitable for the other; for instance we have 
mentioned in the previous chapter that, in applying the graphic 
method, two curves drawn to different scales usually have to be 
employed. 

In order to produce a flatter curve than that given by $y = q_x$,
a function such as $\log(q_x + .1)$ is sometimes calculated from the
data. This function is then graduated instead of $q_x$.

The most satisfactory method for restricted data is, however,
to use a standard table as a “base curve”. There are many ways
in which this can be done. One of the simplest is to calculate
the ratios of the q's derived from the data to the corresponding q's
of the standard table and to graduate these ratios. Clearly if a
suitable table is taken for the base curve the ratio should not
vary very greatly from unity. We are here in fact graduating $q'_x/q_x$,
where $q'_x$ is the rate derived from the data and $q_x$ is obtained from
the standard table. The ratio may quite well be graduated graphically
on one diagram.

It is interesting to note that the first recorded instance of the use 
of a standard table was in a graduation by Griffith Davis. In this 
graduation the ratio was adjusted graphically (J.I.A. Vol. XI).

2. Lidstone’s graphic method. 

In a paper in J.I.A. Vol. XXX, G. J. Lidstone greatly improved
on this method by dealing not with the ratio $q'_x/q_x$ but with
where as before unaccented symbols refer to the
standard table. This function produces values which are not only
smaller than $q'_x/q_x$, but which usually progress more smoothly.

The following quotation from Lidstone’s paper is important: 
“Considering first the mortality table to be used in calculating the
expected deaths, it is evident that smoothness of graduation is most 
essential, more so, in fact, than close agreement with the observed 
rates of mortality, since any irregularities in the standard table 
would be reproduced and possibly exaggerated in our final results; 
for this reason a table following a mathematical law (as, for example,
Makeham’s) will generally be the most suitable to employ.”

If + c (where c is a constant) all the values of will bear 

a constant ratio to the corresponding values of and the function 
will be a straight line parallel to the axis of x. The 
special case when both and follow Makeham’s law was also 
investigated, but most of this work is now of less interest than 
formerly because it is rarely that a modern experience can be 
graduated successfully by that law. 

3. Formulae methods. 

In order to obtain values which progress smoothly enough for a
graphic graduation to be successful it is usually necessary to deal
with functions such as $p_x$ and $q_x$ rather than with the exposed to risk
and deaths, although these are to be preferred on general grounds
because they give effect to the weight of the data at successive ages
or age-groups.

For this reason it is usual to assume some algebraic relationship
between, say, $q'_x$ and $q_x$ and to determine the constants in the
relationship by reference to the exposed to risk and deaths.

The following are a few of the formulae which might be used: 

1. $q'_x = aq_x + b$. (If $b = 0$ this becomes $q'_x/q_x$ = constant.)

2. + 

3. $q'_x = q_x(ax + b)$.

4. $\mu'_x = \mu_{x+n} + K$.

5. $q'_x = a\,q^{(1)}_x + b\,q^{(2)}_x$, where $q^{(1)}_x$ refers to one standard table and
$q^{(2)}_x$ to a second. Usually the mortality rates to be graduated are
intermediate between those of the two standard tables.

In these equations a, b, K and n are constants.

The following points arise in the application of the formulae. 

4. Formulae 1 and 2. 

If we assume for every age that $q'_x = aq_x + b$, it follows that
$E_x q'_x = aE_x q_x + bE_x$,

and = . 

$E_x q'_x$ is merely the actual deaths observed at age x, while $E_x q_x$ is
the expected deaths according to the standard table. When these
expected deaths have been calculated it is a simple matter to derive
the equations (1), which can then be solved for a and b. The
example later in this section will give the details of the calculations
involved.

When the data are grouped and $E_x$ is not available for individual
ages the expected deaths are usually calculated by using $q_x$ for the
central age of the group; e.g. for a group of five ages 30-34 $q_{32}$ would
be used, while for a group of four ages 30-33 $q_{31\frac{1}{2}}$ would be used. The
slight error thus introduced is not a serious matter when the data
are scanty and sampling errors are therefore considerable.
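One possible way of obtaining two equations for a and b is sketched below. It rests on an assumption that is not taken from the text: the ages are divided into two ranges and, in each range, the actual deaths are equated to $a\sum E_x q_x + b\sum E_x$. The function name, the point of division and all the figures are hypothetical.

```python
def fit_formula_1(exposed, deaths, standard_q, split):
    """Solve q' = a*q + b by equating actual and fitted deaths in two age ranges.

    The ranges are the entries before index `split` and those from `split` onwards.
    """
    def sums(lo, hi):
        expected_std = sum(exposed[i] * standard_q[i] for i in range(lo, hi))
        return expected_std, sum(exposed[lo:hi]), sum(deaths[lo:hi])

    eq1, e1, d1 = sums(0, split)
    eq2, e2, d2 = sums(split, len(exposed))
    # Two equations: eq*a + e*b = d for each range, solved by Cramer's rule.
    det = eq1 * e2 - eq2 * e1
    if det == 0:
        raise ValueError("the two ranges do not determine a and b")
    a = (d1 * e2 - d2 * e1) / det
    b = (eq1 * d2 - eq2 * d1) / det
    return a, b


# Hypothetical data constructed so that q' = 1.2*q + .001 exactly.
standard_q = [0.010, 0.012, 0.015, 0.020]
exposed = [1000, 2000, 2000, 1000]
deaths = [e * (1.2 * q + 0.001) for e, q in zip(exposed, standard_q)]

a, b = fit_formula_1(exposed, deaths, standard_q, split=2)
graduated = [a * q + b for q in standard_q]
print(round(a, 6), round(b, 6))
```

With real data the deaths would be whole numbers and the fitted constants would depend to some extent on where the ages are divided, so more than one division might reasonably be tried.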

Similar remarks apply to Formula 2. 


5. Formula 3.

There are several ways in which the constants a and b can be
found. The following method is probably the best.

Put $q'_x/q_x = y$, so that the equation can be written:

$y = ax + b$. ...(2)

We have to determine the constants so as to secure the best fit, 
having regard to the weight of the data at each age or in each age- 
group. 

This is precisely the problem which arises in fitting a line of
regression to data when frequencies have to be allowed for; the same
method can therefore be employed.

First tabulate y for each value of x or for the central age x of each
group and take as the corresponding frequency some convenient
number roughly proportionate to the exposed to risk on which the
value of y is based. (The actual values of the exposed to risk would
make the work very laborious without achieving any material
improvement in the fit.)

Treating y and x as correlated variables we can then find the line
of regression, $y = ax + b$, and hence find $q'_x$ for each age from the
known value of $q_x$ at that age.

6. Formula 4. 

Unless Makeham’s law applies approximately it is impossible to
find n except by trial and error. The method in which the formula
can be applied and K determined will be apparent when we consider
Makeham’s law in Chapter IX. For the moment it is sufficient
to note that the formula has practical advantages.

It will be remembered that

$$\frac{d}{dx}\log D_x = -(\mu_x + \delta),$$

so that an addition of K to $\mu_x$ has the same effect on the values of
$d\log D_x/dx$ as the same addition to the force of interest $\delta$.

It follows that the values of and are the same if we add 

K to the force of interest as if we add K to the force of mortality. 

If, therefore, we find it possible to assume that

$$\mu'_x = \mu_{x+n} + K,$$

many functions can be obtained from tables based on the standard
mortality by adding n years to the age and by adding K to the force
of interest. Annuity values can be derived in this way and the single
and annual net premiums can then be deduced from Premium
Conversion Tables using the true rate of interest and not the rate
produced as a result of adding K to $\delta$.

In actual practice refinements are out of place in using a table
based on scanty data and it is usual to add K to the rate of interest
instead of to $\delta$ and to round off the result to the nearest rate of
interest for which functions are already tabulated.

7. Formula 5.

If the decremental rates of the special experience seem intermediate
between those according to two standard tables this
method can give good results.

Since 

and . 

The summations extend over all ages, and once the expected deaths
according to the two standard tables have been calculated the constants
a and b can easily be found from equations (3).

It should be pointed out that when the available constants are
connected by linear equations such as (3), linear constraints are
said to be imposed. In applying the $\chi^2$ test, therefore, it should be
remembered that two degrees of freedom are lost.

As an example we shall deal with a graduation by a slight modification
of Formula 1. The method illustrates most of the points
involved.

8. Example. 

Graduate the following experience by means of the formula

$$q'_x = a\,q_{x-n} + b,$$

where n = 10 and the values of $q_x$ are taken from the table (Makeham
Graduation). $\theta_x$ denotes the number of deaths between age x and x+1
and $E_x$ is the corresponding exposed to risk.


Age x    E_x    θ_x      Age x    E_x    θ_x      Age x    E_x    θ_x

  30     1000     5        40     1600     7        50     1600    12
  31     1100     4        41     1700    12        51     1500    13
  32     1200     5        42     1700     5        52     1500    14
  33     1300     6        43     1700    14        53     1400    12
  34     1400     6        44     1800    12        54     1300    11
  35     1500     8        45     1800    12        55     1200    13
  36     1500     6        46     1800    14        56     1100     8
  37     1500     8        47     1800    14        57     1000    11
  38     1600    10        48     1700    10        58      900     9
  39     1600     7        49     1600    13        59      800     9
                                                    60      800     8

The necessary calculations are shown in the following table, column
(4) of which shows the expected deaths according to the table.
Column (3) is obtained by summing column (2) successively from the
top downwards: i.e. the rth entry in column (3) is the sum of the first
r entries of column (2).


Table XVII

 Age      E_x      ΣE_x    E_x q_{x-10}    Σ(4)     θ_x    Σθ_x
 (1)      (2)      (3)        (4)          (5)      (6)     (7)

  30     1,000     1,000       5.7          5.7       5       5
  31     1,100     2,100       6.7         12.4       4       9
  32     1,200     3,300       7.7         20.1       5      14
  33     1,300     4,600       8.7         28.8       6      20
  34     1,400     6,000       9.7         38.5       6      26
  35     1,500     7,500      10.6         49.1       8      34
  36     1,500     9,000      10.8         59.9       6      40
  37     1,500    10,500      11.0         70.9       8      48
  38     1,600    12,100      11.9         82.8      10      58
  39     1,600    13,700      12.1         94.9       7      65
  40     1,600    15,300      12.3        107.2       7      72
  41     1,700    17,000      13.3        120.5      12      84
  42     1,700    18,700      13.7        134.2       5      89
  43     1,700    20,400      14.0        148.2      14     103
  44     1,800    22,200      15.1        163.3      12     115
  45     1,800    24,000      15.5        178.8      12     127
  46     1,800    25,800      15.9        194.7      14     141
  47     1,800    27,600      16.4        211.1      14     155
  48     1,700    29,300      15.9        227.0      10     165
  49     1,600    30,900      15.5        242.5      13     178
  50     1,600    32,500      16.0        258.5      12     190
  51     1,500    34,000      15.6        274.1      13     203
  52     1,500    35,500      16.2        290.3      14     217
  53     1,400    36,900      15.7        306.0      12     229
  54     1,300    38,200      15.2        321.2      11     240
  55     1,200    39,400      14.7        335.9      13     253
  56     1,100    40,500      14.1        350.0       8     261
  57     1,000    41,500      13.5        363.5      11     272
  58       900    42,400      12.7        376.2       9     281
  59       800    43,200      11.9        388.1       9     290
  60       800    44,000      12.6        400.7       8     298

 Total  44,000   729,100     400.7      5,855.1     298   4,282

Similarly, column (5) is derived from column (4) and column (7) from
column (6). The sums of columns (3), (5) and (7) then give the values
of $\Sigma^2 E_x$, $\Sigma^2 E_x q_{x-10}$ and $\Sigma^2\theta_x$ required.

Note that the sum of column (2) automatically checks the last entry in 
column (3) and similarly for the other columns. Columns (3), (5) and 
(7) could have been obtained by summing from the bottom upwards. 
The values of a and b would be the same whichever method were adopted. 

The equations corresponding to those numbered (1) on p. 168 are:

$$298 = 400.7a + 44{,}000b$$

and

$$4282 = 5855.1a + 729{,}100b.$$

These give a = .8363 and b = -.00084.

The graduated rates of mortality can be derived from the equation

$$q'_x = .8363\,q_{x-10} - .00084.$$

The actual deaths and those expected according to the graduated table 
(not the standard table) should be compared; owing, however, to the 
small number of deaths involved any elaborate tests would be out of the 
question. There is no need to test for smoothness, since the standard 
table was graduated by a mathematical formula. 
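
The arithmetic of this example can be retraced with a short sketch. It assumes the columns of Table XVII as read above (the variable names are illustrative, not the author's), forms the first and second summations, and solves the two simultaneous equations by Cramer's rule. Because the tabulated expected deaths are rounded to one decimal place, the value of a obtained in this way agrees with the figure quoted in the text only to about three significant figures.

```python
from itertools import accumulate

# Columns (2), (4) and (6) of Table XVII for ages 30 to 60.
exposed = [1000, 1100, 1200, 1300, 1400, 1500, 1500, 1500, 1600, 1600,
           1600, 1700, 1700, 1700, 1800, 1800, 1800, 1800, 1700, 1600,
           1600, 1500, 1500, 1400, 1300, 1200, 1100, 1000, 900, 800, 800]
expected = [5.7, 6.7, 7.7, 8.7, 9.7, 10.6, 10.8, 11.0, 11.9, 12.1,
            12.3, 13.3, 13.7, 14.0, 15.1, 15.5, 15.9, 16.4, 15.9, 15.5,
            16.0, 15.6, 16.2, 15.7, 15.2, 14.7, 14.1, 13.5, 12.7, 11.9, 12.6]
deaths = [5, 4, 5, 6, 6, 8, 6, 8, 10, 7,
          7, 12, 5, 14, 12, 12, 14, 14, 10, 13,
          12, 13, 14, 12, 11, 13, 8, 11, 9, 9, 8]


def first_and_second_sums(values):
    """Return (total, total of the running totals) summing from the top."""
    running = list(accumulate(values))
    return running[-1], sum(running)


s1_e, s2_e = first_and_second_sums(exposed)    # 44,000 and 729,100
s1_q, s2_q = first_and_second_sums(expected)   # 400.7 and 5,855.1
s1_d, s2_d = first_and_second_sums(deaths)     # 298 and 4,282

# Solve   s1_d = a*s1_q + b*s1_e   and   s2_d = a*s2_q + b*s2_e.
det = s1_q * s2_e - s2_q * s1_e
a = (s1_d * s2_e - s2_d * s1_e) / det
b = (s1_q * s2_d - s2_q * s1_d) / det

print(f"a = {a:.4f}, b = {b:.5f}")
```

Summing from the bottom upwards instead would change the second summations but, as the text notes, not the resulting values of a and b.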

9. Advantages of the method. 

The method of graduation by reference to a standard table is 
particularly valuable when the data are scanty, so that most other 
methods are out of the question. In such cases even a graphic
graduation would be largely guesswork. 

If the standard table is smooth (as it certainly should be) the 
results are satisfactory as far as smoothness is concerned and it is 
possible to concentrate on tests for adherence to data. 

Knowledge of other tables based on similar experience is automatically
brought into use in the process of graduation.

The method can be adapted to select tables, but with scanty data 
the select rates themselves are suspect, as the sampling errors are 
so great. 

The ends of the table cause little difficulty, but the reliability of 
the results may of course be doubtful. 

10. Disadvantages of the method. 

It is not always possible to find a suitable standard table, so that 
even if the constants in the graduation formula are chosen properly 
the adherence of the results to the rough data is not satisfactory. 

EXAMPLES 7

1. Show how a comparatively small mortality experience may be 
graduated by reference to a standard table: 

(a) by a graphic method; 

(b) by a formula so that the first and second summations of the actual
deaths and the expected deaths are equal. 

Give any conditions necessary for, or restrictions to be imposed upon, 
the use of the methods stated. 

2. The values of $q_x$ given in the following schedule are to be graduated
by the formula

$$q_x = a + b\,q'_x,$$

where $q'_x$, values of which are given in the last column of the schedule,
is the rate of mortality according to a standard table.

Obtain values for the constants a and b.

 Age x   Exposed   Deaths   Ungraduated rate   Standard rate
         to risk            of mortality       of mortality q'_x

  80       250       35         .140               .134
  81       200       25         .125               .144
  82       150       22         .147               .154
  83       120       21         .175               .165
  84       100       18         .180               .176
  85        70       15         .214               .188
  86        50        9         .180               .200
  87        30       10         .333               .213
  88        20        5         .250               .227
  89        10        5         .500               .242

3. Graduate the rates of mortality of Example 3 at the end of
Chapter VI by reference to a suitable standard table, adjusting the 
ages if necessary. 


4. Obtain graduated rates of mortality from the following experience 
with reference to any standard table which you consider to be suitable. 


 Age-group   Exposed   Deaths      Age-group   Exposed   Deaths
             to risk                           to risk

  10-20         77       -          51-55       5891       61
  21-25       1472       5          56-60       5415       68
  26-30       5449       8          61-65       3907       77
  31-35       8087      16          66-70       1687       41
  36-40       7739      19          71-75        451       18
  41-45       7111      24          76-80         71        6
  46-50       6237      37          81-85          1        1


5. Obtain graduated rates of mortality from the following experience 
by the formula 




of para. 7, using as the standard tables the A1924-29 and the A1924-29 
Light tables. 


 Age-group   Exposed       Deaths      Age-group   Exposed   Deaths
             to risk                               to risk

  10-20          33          -          61-65      43,244    1084
  21-25      (illegible)     2          66-70      34,013    1349
  26-30       5,312         17          71-75      22,237    1326
  31-35      13,811         31          76-80      12,516    1240
  36-40      (illegible)    63          81-85       5,243    (illegible)
  41-45      28,403        147          86-90       1,944     327
  46-50      37,014        242          91-95         415      70
  51-55      42,789        442          96-100         78      16
  56-60      45,484        731          101-           15       2

6. The following data are derived from the Continuous Investigation
into the Mortality of Assured Lives over the 5-year period 1934-38: 


Whole-Life with profit policies. Medical section. 
Duration 3 and over 


 Age-group   Exposed       Deaths      Age-group   Exposed   Deaths
             to risk                               to risk

  10-20         110          -          61-65      47,151    1161
  21-25      (illegible)     7          66-70      35,700    1390
  26-30      10,761         25          71-75      22,688    1344
  31-35      21,898         47          76-80      12,587    1246
  36-40      29,033         82          81-85       5,244     717
  41-45      35,514        171          86-90       1,944     327
  46-50      43,251        279          91-95         415      70
  51-55      48,681        503          96-100         78      16
  56-60      50,899        799          101-           15       2


Taking the A1924-29 table as a standard, use a graphic method to
graduate the values of

$$\frac{q_x\ (\text{observed})}{q_x\ (\text{A1924-29})}.$$


7. Using the data of the previous question, obtain graduated values of
$q_x$ using the a(m) ult. table instead of the A1924-29.

Comment on the results, bearing in mind (a) the relative smoothness,
and (b) the suitability as regards similarity of the experiences of the two
standard tables used.






CHAPTER VIII 


GRADUATION BY A SUMMATION 
FORMULA 

1. It is convenient in this chapter to regard any ungraduated value
$u'_x$ as consisting of two parts, the true or universe value $u_x$ and a
superimposed error $\epsilon_x$. Thus

$$u'_x = u_x + \epsilon_x.$$

$\epsilon_x$ may be positive or negative and although we shall for the most
part consider errors of sampling, the $\epsilon$'s will in practice contain
inaccuracies as well, which will be dealt with by the formulae in
exactly the same way as the sampling errors.

An ideal graduation would of course eliminate all the $\epsilon$'s, but it
will be seen that the most that can be attained by the use of a summation
formula is a reduction of the $\epsilon$'s and a smooth progression
of the graduated values. Although we shall concentrate much of
our attention on reduction of error and smoothness, it should be
borne in mind that all the usual tests of adherence to data should be
applied to the results of any graduation. This is a point the importance
of which is sometimes overlooked because actuarial literature
on the subject of summation formulae tends to deal with methods
rather than results.


2. Running averages. Reduction of irregularities. 

In analysing a series of observations which show irregularities
in the form of ripples or undulations statisticians often tabulate
moving or “running” averages as a means of showing the general
trend of the observations. By taking an average of, say, five consecutive
values, the ripples are greatly reduced. This can best be
illustrated by a consideration of the series shown in Table XVIII.

The first column is a series of values written down at random. 
The next column gives moving averages, each of which is the 
average of the corresponding value in the first column and the 
values on either side. Thus the average of the first three items in the 
first column is 4; this is written on the second line. Similarly, the 
average of the second, third and fourth is 3.66, and so on. These are
examples of moving averages as used by statisticians; it is usual, 
however, to take the average of more than three terms, twelve 
being a common number in connection with monthly observations. 


Table XVIII

   4
   1      4
   7      3.66     4.88
   3      7        6.11     6.33
  11      7.66     8        7.26
   9      9.33     7.66     7.70
   8      6        7.44     7.52
   1      7        7.44     8.11
  12      9.33     9.44
  15     12
   9


We need not confine our method to the figures in the first column, 
and if we perform the same operation on those in the second column 
we obtain the values shown in the third column. Similarly, the 
last column can be derived from the third. The irregularities of the 
first column have been greatly reduced by the averaging. It will 
be seen later that this would have been more marked if the averages 
had not always related to three values, e.g. if in deriving the third 
column from the second the average of four consecutive terms had 
been taken, and the average of two consecutive terms in arriving at
the fourth column from the third. 
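
The construction of Table XVIII can be sketched in a few lines. The first column below is the series as read from the table; `running_average` is an illustrative helper, not a name from the text. The computed values agree with the table to within a unit in the second decimal place, since the printed figures have evidently been cut off rather than rounded.

```python
def running_average(values, n):
    """Average of each set of n consecutive values."""
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]


first = [4, 1, 7, 3, 11, 9, 8, 1, 12, 15, 9]   # first column of Table XVIII
second = running_average(first, 3)             # 9 values
third = running_average(second, 3)             # 7 values
fourth = running_average(third, 3)             # 5 values

for name, column in (("second", second), ("third", third), ("fourth", fourth)):
    print(name, [round(v, 2) for v in column])
```

Each pass loses one value at either end, so that eleven original values yield only five in the last column.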

3. Distortion of smooth values. 

In Table XIX the values shown in the first column have been
dealt with in a similar manner. The values in the second column
have been found by averaging each set of four consecutive values of
$u_x$, and here it will be noticed that they are not in alignment with the
values of $u_x$. This is because the average of an even number of
values cannot be regarded as corresponding to either of the two
central values but to a hypothetical value lying between them.
Consequently the values in the second column correspond to values
between those in the first column. The third column is derived from
the second by averaging in fives and the values are in alignment
with those in the second column. Finally, the values in the last
column are derived from those in the third by averaging six at a
time and are therefore out of alignment with those values, i.e. they
are back in alignment with the original values of $u_x$.


Table XIX 

Average Average Average 

of 4 terms of 5 terms of 6 terms 



These original values were actually obtained from the expression

$$u_x = 1100 + 2x - 5x^2$$

by giving x the successive values 1, 2, 3, ... 15, and were therefore
ideally smooth. The process of averaging has distorted these smooth
values quite appreciably, the difference in each case being 30.83.
The reason for this will be seen later, but the two examples have
shown that by averaging successive values we tend (i) to reduce
irregularities and fluctuations and, (ii) to distort values already
smooth. We shall deal first with the second of these two features.
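
The distortion can be checked directly. The sketch below regenerates the fifteen smooth values from the expression quoted, averages them in fours, fives and sixes in turn, and compares the three surviving values with the original values at x = 7, 8 and 9. Exact fractions are used so that no rounding enters; the helper name is illustrative.

```python
from fractions import Fraction


def running_average(values, n):
    """Average of each set of n consecutive values, kept as exact fractions."""
    return [sum(values[i:i + n], Fraction(0)) / n
            for i in range(len(values) - n + 1)]


u = [Fraction(1100 + 2 * x - 5 * x * x) for x in range(1, 16)]

col2 = running_average(u, 4)      # 12 values, lying between the original ages
col3 = running_average(col2, 5)   # 8 values, in line with col2
col4 = running_average(col3, 6)   # 3 values, back in line with x = 7, 8, 9

# u[6] is the value at x = 7, the centre of the first thirteen terms.
distortions = [col4[i] - u[6 + i] for i in range(3)]
print([float(d) for d in distortions])
```

Each of the three differences is exactly -185/6, i.e. -30.83 to two decimal places.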

4. [n] or “summation n”.

In Mathematics for Actuarial Students, Part II, pp. 114 et seq.,
the operator [n] or “summation n” is defined thus:

$$[n]u_0 = u_{-\frac{n-1}{2}} + u_{-\frac{n-3}{2}} + \dots + u_{\frac{n-3}{2}} + u_{\frac{n-1}{2}},$$

whether n is odd or even.

If n is odd the central term is $u_0$, and $[n]u_0$ is simply the sum of
n consecutive terms, the central one of which is $u_0$.

Thus $[5]u_0 = u_{-2} + u_{-1} + u_0 + u_1 + u_2$.

If n is even the two middle terms are $u_{-\frac12}$ and $u_{\frac12}$, and $u_0$ itself
does not appear. Nevertheless $[n]u_0$ still represents the sum of
n consecutive u's with an equal number lying on each side of $u_0$, e.g.

$$[6]u_0 = u_{-\frac52} + u_{-\frac32} + u_{-\frac12} + u_{\frac12} + u_{\frac32} + u_{\frac52}.$$

In Table XIX the second column was obtained by the
operation $[4]/4$, the next was obtained by the operation $[5]/5$ and the
last by the operation $[6]/6$.

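Gauss’s formula may be written

$$u_r = u_0 + r\Delta u_0 + r_{(2)}\Delta^2 u_{-1} + (r+1)_{(3)}\Delta^3 u_{-1} + (r+1)_{(4)}\Delta^4 u_{-2} + \dots$$

Also $\Delta^{-1} r_{(t)} = r_{(t+1)}$, r being the argument.

Hence

$$[n]u_0 = \Big[\Delta^{-1}u_r\Big]_{-\frac{n-1}{2}}^{\frac{n+1}{2}}
= \Big[r u_0 + r_{(2)}\Delta u_0 + r_{(3)}\Delta^2 u_{-1} + (r+1)_{(4)}\Delta^3 u_{-1} + (r+1)_{(5)}\Delta^4 u_{-2}\Big]_{-\frac{n-1}{2}}^{\frac{n+1}{2}}$$

$$= n u_0 + \frac{n(n^2-1)}{24}\Delta^2 u_{-1} + \frac{n(n^2-1)(n^2-9)}{1920}\Delta^4 u_{-2},$$

ignoring sixth and higher differences.

It is usual to write b for the operator $\Delta^2 E^{-1}$ and the above result
is then written in the form

$$\frac{[n]}{n}u_x = \left\{1 + \frac{n^2-1}{24}\,b + \frac{(n^2-1)(n^2-9)}{1920}\,b^2\right\}u_x \qquad (1)$$

on replacing $u_0$ by the more general symbol $u_x$. This result is
important and should be memorized.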

We see therefore that by taking the average of n successive values
we introduce distortions of $\frac{n^2-1}{24}\,bu_x$ (the second difference error)
and $\frac{(n^2-1)(n^2-9)}{1920}\,b^2u_x$ (the fourth difference error).
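
The expansion of [n]/n in second and fourth central differences can be verified numerically. The sketch below takes an arbitrary quartic (for which sixth and higher differences vanish, so that the truncated expansion should be exact) and compares the average of n consecutive values with the right-hand side of the formula. The function names and the particular quartic are assumptions made for illustration only.

```python
from fractions import Fraction


def summation(f, x, n):
    """[n] f(x): the sum of n consecutive values centred on x.

    When n is even the arguments fall on half-integers either side of x.
    """
    start = Fraction(x) - Fraction(n - 1, 2)
    return sum(f(start + k) for k in range(n))


def central2(f):
    """Return the function x -> b f(x), the second central difference of f."""
    return lambda x: f(x + 1) - 2 * f(x) + f(x - 1)


def average_by_formula(f, x, n):
    """Right-hand side of the expansion, stopping at the fourth difference."""
    b1 = central2(f)
    b2 = central2(b1)
    return (f(x)
            + Fraction(n * n - 1, 24) * b1(x)
            + Fraction((n * n - 1) * (n * n - 9), 1920) * b2(x))


def quartic(x):
    return x**4 - 3 * x**3 + 2 * x + 7


x0 = Fraction(5)
checks = [summation(quartic, x0, n) / n == average_by_formula(quartic, x0, n)
          for n in range(1, 10)]
print(all(checks))
```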



5. Second and fourth difference errors.

Most of the well-known formulae for graduation by summation
involve three operators, often denoted by [l], [m] and [n].

Since finite difference operators obey the ordinary laws of algebra
within certain well-defined limits:

$$\frac{[l][m][n]}{lmn}u_x = \left\{1 + \frac{l^2-1}{24}b + \frac{(l^2-1)(l^2-9)}{1920}b^2\right\}\left\{1 + \frac{m^2-1}{24}b + \frac{(m^2-1)(m^2-9)}{1920}b^2\right\}\left\{1 + \frac{n^2-1}{24}b + \frac{(n^2-1)(n^2-9)}{1920}b^2\right\}u_x,$$

ignoring sixth and higher differences,

$$= \left\{1 + \frac{l^2+m^2+n^2-3}{24}\,b + A\,b^2\right\}u_x,$$

where A is an algebraic function of l, m, n which can readily be
evaluated numerically in any given case.

For the moment we shall ignore the fourth difference error and
deal only with the second term $\frac{l^2+m^2+n^2-3}{24}\,bu_x$.

In the example on p. 178, l, m and n were 4, 5 and 6.

Hence

$$\frac{l^2+m^2+n^2-3}{24} = \frac{74}{24} = \frac{37}{12}.$$

Also $bu_x$ (i.e. $\Delta^2u_{x-1}$) $= -10$, third and higher orders of differences
vanishing.

Hence for all values of x the second difference error is $-\frac{37}{12}\times 10$.

This reduces to -30.83, the actual distortion produced.


6. Choice of operand. 

With one important exception it may be said that formulae 
used in practice introduce no second difference error. Even in 
the exception which will be discussed later the second difference 
error is small. 

To eliminate the error we operate not on $u_x$ itself but on an
expression known as the operand; this operation will counterbalance
the second difference error introduced by the successive
summations.



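For instance, for three summations [l], [m] and [n] the operand
must reduce to

$$\left\{1 - \frac{l^2+m^2+n^2-3}{24}\,b\right\}u_x,$$

ignoring fourth and higher differences.

The result of the operations denoted by [l][m][n]/lmn on this
function will be

$$\frac{[l][m][n]}{lmn}\left\{1 - \frac{l^2+m^2+n^2-3}{24}\,b\right\}u_x
= \left\{1 + \frac{l^2+m^2+n^2-3}{24}\,b\right\}\left\{1 - \frac{l^2+m^2+n^2-3}{24}\,b\right\}u_x = u_x,$$

ignoring fourth and higher differences.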

It will be evident that this gives an unlimited choice of functions 
which satisfy the criterion, and even if we restrict ourselves to 
those which are practical and easily handled the number available 
will be considerable. 

The term operator is loosely applied to the combination of
the [n] operators and the dividing factors (e.g. [l][m][n]/lmn),
although the distinction between operator and operand is somewhat
arbitrary. The order in which the operations are effected is immaterial.

Consider the operator $\dfrac{[5]^3}{125}$.

Putting l = m = n = 5 we see that the second difference error in
the formula is 3b.

Hence the operand must reduce to 1 - 3b.

Now $(1 - 3b)u_x = u_x - 3(u_{x-1} - 2u_x + u_{x+1})$

$= -3u_{x-1} + 7u_x - 3u_{x+1}$,

which may be written $\{10[1] - 3[3]\}u_x$.

Thus the formula

$$\frac{[5]^3}{125}\{10[1] - 3[3]\}u_x \qquad (2)$$

will not introduce any second difference error.

This is known as King’s form of Woolhouse’s formula and is of
historic interest.





Alternatively, it will be found that the operand

$$-u_{x-2} + u_{x-1} + u_x + u_{x+1} - u_{x+2}$$

also reduces to $\{1 - 3b\}u_x$, ignoring fourth and higher differences.
Writing this operand in the form 2[3] - [5], we have the formula

$$\frac{[5]^3}{125}\{2[3] - [5]\}u_x \qquad (3)$$

(Higham’s formula).

These formulae have the same operator and neither introduces 
any second difference error; the second however is greatly superior 
to the first in the way in which it enables superimposed errors to 
be dealt with. 

We have seen that the operator [n] does not involve first or third
differences: hence the operand must also exclude them and must
therefore be of the form

$$c_0u_x + c_1(u_{x-1} + u_{x+1}) + c_2(u_{x-2} + u_{x+2}) + \dots;$$

i.e. terms equidistant from the central value must have the same
coefficient.

From Gauss’s formula

$$u_{x+r} + u_{x-r} = 2u_x + r^2\,bu_x + \frac{r^2(r^2-1)}{12}\,b^2u_x + \dots \qquad (4)$$

another result which should be memorized.

By means of this formula it is a simple matter to find operands 
which, when combined with a given operator, will produce no 
second difference error. 

The following are examples of the use of both formulae. 


Example 1.

Find a suitable operand to combine with $\dfrac{[4][5][6]}{120}$.

Now

$$\frac{[4][5][6]}{120} = 1 + \frac{4^2+5^2+6^2-3}{24}\,b = 1 + 3\tfrac{1}{12}\,b.$$

Hence the operand should reduce to $(1 - 3\tfrac{1}{12}b)u_x$, which is rather an
awkward expression. If we are prepared to ignore the small distortion of
$\tfrac{1}{12}bu_x$ we obtain a much simpler formula.

Suppose that we decide to have an operand involving $u_{x-2}$ to $u_{x+2}$, i.e.
of the form $c_0u_x + c_1(u_{x-1} + u_{x+1}) + c_2(u_{x-2} + u_{x+2})$.

Using formula (4) this reduces to

$$(c_0 + 2c_1 + 2c_2)u_x + (c_1 + 4c_2)\,bu_x + \text{terms in } b^2u_x, \text{ etc.}$$

If this is to reduce to $(1 - 3b)u_x$ we have

$$c_0 + 2c_1 + 2c_2 = 1,$$
$$c_1 + 4c_2 = -3.$$

Solutions are $c_0 = c_1 = 1$ and $c_2 = -1$.

We have therefore the formula

$$\frac{[4][5][6]}{120}\{-u_{x-2} + u_{x-1} + u_x + u_{x+1} - u_{x+2}\} \qquad (5)$$

(Hardy’s Friendly Society formula).

Although there is a slight second difference error here, this formula was
used very successfully for the purpose for which it was devised and is the
best-known formula involving such an error.

Alternatively, we could have written the unknown operand in the form

$\{K_1[1] + K_3[3] + K_5[5] + \dots\}u_x$, the K's being constants.

If we decide to restrict its range to five terms (i.e. to ignore [7] etc.)
we can use formula (1) to reduce this to the form

$$\{K_1 + K_3\,3(1 + \tfrac13 b) + K_5\,5(1 + b)\}u_x = \{(K_1 + 3K_3 + 5K_5) + (K_3 + 5K_5)b\}u_x.$$

If this is to reduce to $\{1 - 3b\}u_x$, we have

$$K_1 + 3K_3 + 5K_5 = 1,$$
$$K_3 + 5K_5 = -3,$$

giving as possible solutions $K_1 = 0$, $K_3 = 2$, $K_5 = -1$, i.e. the operand
$\{2[3] - [5]\}u_x$ as before.


Example 2.

Find a suitable operand, involving terms $u_{x-3}$ to $u_{x+3}$, for use with the
operator $\dfrac{[5][13]}{65}$.

This is perhaps the best-known example of a two-term operator.

As has been said previously, to achieve smooth results, operators
involving three summations are usual. This two-term operator was
designed for a special purpose.

$$\frac{[5][13]}{65} = 1 + \frac{5^2+13^2-2}{24}\,b = 1 + 8b.$$

The operand

$$c_0u_x + c_1(u_{x-1} + u_{x+1}) + c_2(u_{x-2} + u_{x+2}) + c_3(u_{x-3} + u_{x+3})$$

reduces to

$$(c_0 + 2c_1 + 2c_2 + 2c_3)u_x + (c_1 + 4c_2 + 9c_3)\,bu_x,$$

to third differences.

If this is equivalent to $(1 - 8b)u_x$ we have

$$c_0 + 2c_1 + 2c_2 + 2c_3 = 1$$

and

$$c_1 + 4c_2 + 9c_3 = -8.$$

Convenient integral solutions are $c_0 = c_1 = 1$, $c_2 = 0$, $c_3 = -1$, giving the
formula

$$\frac{[5][13]}{65}\{-u_{x-3} + u_{x-1} + u_x + u_{x+1} - u_{x+3}\}$$

or

$$\frac{[5][13]}{65}\{[3] + [5] - [7]\}u_x \qquad (6)$$

(Hardy’s “wave-cutting” formula).


7. Calculation of fourth difference error.

Hitherto we have concentrated on the elimination of the second 
difference error. It is, however, important to know what fourth 
difference is introduced. 

As an example we shall consider Spencer’s 21-term formula,
probably the most famous and generally satisfactory of all summation
formulae:

$$\frac{[5]^2[7]}{350}\{[1] + [3] + [5] - [7]\}u'_x. \qquad (7)$$

The operand can also be written thus:

$$\{-u'_{x-3} + u'_{x-1} + 2u'_x + u'_{x+1} - u'_{x+3}\},$$

and formula (4) of this chapter can be used to expand this in terms
of differences. We shall, however, use the formula

$$[n] = n\left\{1 + \frac{n^2-1}{24}\,b + \frac{(n^2-1)(n^2-9)}{1920}\,b^2\right\},$$

ignoring $b^3$ etc.; this formula would in any case have to be used
for the operator.

Substituting in (7) we have

$$\frac{[5]^2[7]}{350}\{[1] + [3] + [5] - [7]\}u'_x
= \tfrac12\{1 + b + \tfrac15 b^2\}^2\{1 + 2b + b^2\}\{1 + 3(1 + \tfrac13 b) + 5(1 + b + \tfrac15 b^2) - 7(1 + 2b + b^2)\}u'_x$$

$$= \tfrac12\{1 + 2b + \tfrac75 b^2 + \dots\}\{1 + 2b + b^2\}\{2 - 8b - 6b^2 \dots\}u'_x$$

$$= \{1 - 12.6\,b^2 \dots\}u'_x. \qquad (8)$$

Thus there is no second difference error, but a fourth difference
error of $-12.6\,b^2u'_x$.
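
The algebra of this paragraph lends itself to mechanical checking. In the sketch below each [n] is held as a series in b cut off after $b^2$, the series are multiplied together, and the coefficients of 1, b and $b^2$ in the complete formula are read off. The function names and the dictionary form of the operand are illustrative assumptions, not the author's notation.

```python
from fractions import Fraction


def series(n):
    """[n] = n{1 + (n^2-1)/24 b + (n^2-1)(n^2-9)/1920 b^2}, as [c0, c1, c2]."""
    return [Fraction(n),
            Fraction(n * (n * n - 1), 24),
            Fraction(n * (n * n - 1) * (n * n - 9), 1920)]


def multiply(p, q):
    """Product of two series in b, discarding b^3 and higher powers."""
    out = [Fraction(0)] * 3
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < 3:
                out[i + j] += pi * qj
    return out


def formula_series(summations, divisor, operand):
    """Coefficients of 1, b and b^2 for a complete summation formula.

    `operand` maps n to the coefficient of [n], e.g. {3: 2, 5: -1} for 2[3]-[5].
    """
    total = [Fraction(0)] * 3
    for n, coeff in operand.items():
        total = [t + coeff * s for t, s in zip(total, series(n))]
    for n in summations:
        total = multiply(total, series(n))
    return [t / divisor for t in total]


spencer21 = formula_series([5, 5, 7], 350, {1: 1, 3: 1, 5: 1, 7: -1})
hardy_fs = formula_series([4, 5, 6], 120, {3: 2, 5: -1})

print([str(c) for c in spencer21])     # coefficients of 1, b, b^2
print([str(c) for c in hardy_fs[:2]])  # coefficients of 1, b
```

For Spencer's 21-term formula the coefficient of b vanishes and that of $b^2$ is -63/5 = -12.6; for Hardy's Friendly Society formula the coefficient of b is 1/12, the small second difference error accepted in Example 1.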

8. The range. 

The meaning of the “range” of a formula is almost self-evident.
Briefly, though not strictly accurately, it may be said to be the
number of ungraduated u's involved in the calculation of a single
graduated value. The exception to this definition arises if it is found
that some of the coefficients are zero when the formula is fully
expanded (see para. 10). For instance, the range of the expression

$$-u'_{x-3} + u'_{x-1} + u'_x + u'_{x+1} - u'_{x+3}$$

is seven, although only five terms are apparently involved.

The range can be found easily as follows. Find the range of the
operand by inspection and add l - 1, m - 1, n - 1, ... etc. for the
operations [l] [m] [n]....

The following diagram illustrates the effect of the operator [5]
on a 7-term operand, each term being represented by a dot.

        . . . . . . .                  7 terms
          . . . . . . .
            . . . . . . .
              . . . . . . .
                . . . . . . .
        ---------------------
Total   . . . . . . . . . . .         11 terms

The first line represents the operand and each subsequent line
represents the effect of increasing the argument by 1. The total
therefore represents the effect of the operator [5] and it will be seen
that the original 7 terms have been increased to 11. The process
is, of course, quite general and can be applied to several operators
in succession.

Thus, in Spencer’s formula above, the range of the operand is 7.
Hence the range of the formula = 7 + 4 + 4 + 6 = 21 terms.

Similarly, the range of Hardy’s “wave-cutting” formula is 
23 terms. 

It is sometimes said that in using a summation formula one 
assumes that the underlying true function is a polynomial of 
the third degree. This is not correct. Nearly all these formulae 
involve no second difference error, but they will involve a fourth
difference error unless the true values of the function being
graduated have negligible fourth differences over the range of values 
covered by a single application of the formula. Thus the use of 
Spencer’s 21-term formula does not imply that the function follows 
a third-degree curve over its entire range, but merely that any 
given set of 21 consecutive values can be represented with sufficient 
accuracy by a polynomial of the third degree. A different polynomial 
will usually be implied for each different set of 21 terms. 

It will be seen therefore that, other things being equal, the 
shorter the range of a formula the better, because 

(1) it is easier to apply; 

(2) the assumption that fourth and higher differences are 
negligible over the range is more likely to be accurate; and 

(3) a smaller number of terms at the ends remain to be filled in 
by other methods. 


This last point is important and will be dealt with more fully in
a later section. For the time being it is sufficient to point out that
if a formula has a range of n, the first graduated value produced
corresponds to the $\frac{n+1}{2}$th ungraduated value, leaving $\frac{n-1}{2}$ values
at the beginning and similarly $\frac{n-1}{2}$ values at the end to be filled in
by other methods.

Thus in Table XVIII the operator $[3]^3/27$ with a range of 7 left
3 terms blank at the beginning and end of the graduated values.
Similarly in Table XIX, where the range of the formula was 13,
only 3 graduated values could be obtained out of 15.
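
The rule for the range, and the number of graduated values it leaves, can be put in a couple of lines. The function names below are illustrative only.

```python
def formula_range(operand_range, summations):
    """Range of the operand plus (n - 1) for each summation [n]."""
    return operand_range + sum(n - 1 for n in summations)


def graduated_count(number_of_values, rng):
    """Graduated values obtainable when (rng - 1)/2 are lost at each end."""
    return max(number_of_values - rng + 1, 0)


print(formula_range(7, [5, 5, 7]))   # Spencer's 21-term formula
print(formula_range(7, [5, 13]))     # Hardy's "wave-cutting" formula
print(graduated_count(15, formula_range(1, [4, 5, 6])))   # Table XIX
print(graduated_count(11, formula_range(1, [3, 3, 3])))   # Table XVIII
```

In the two tables the operand is simply the single value $u_x$, so its range is taken as 1.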

9. Effect of a summation formula on superimposed errors.

Hitherto we have considered only the way in which a summation 
formula affects the underlying true values and we have seen that it 
is a simple matter to ensure that, apart from the fourth difference 
error, it will reproduce them without distortion. 

The whole purpose of the graduation is to eliminate the superimposed
errors as far as possible. The tests now to be discussed deal
with this aspect of the problem.

A complete analysis of a formula includes an investigation of the 
following features: 

(1) The range. 

(2) The second difference error. 

(3) The fourth difference error. 

(4) The error-reducing power of the formula. 

(5) Its smoothing power. 

(6) Its “wave-cutting” properties.

Of these, (1), (2) and (3), which deal with the underlying true
values, have already been described, and (3) is probably the least
important. To investigate all the points detailed it is necessary to
expand the formula.


10. Expansion of a formula. 

Any summation formula can be written in the simple forms

$$K_0u'_x + K_1(u'_{x+1} + u'_{x-1}) + K_2(u'_{x+2} + u'_{x-2}) + \dots + K_r(u'_{x+r} + u'_{x-r}) \quad \text{(range odd)}$$

or

$$K_{\frac12}(u'_{x+\frac12} + u'_{x-\frac12}) + K_{\frac32}(u'_{x+\frac32} + u'_{x-\frac32}) + \dots + K_{r+\frac12}(u'_{x+r+\frac12} + u'_{x-r-\frac12}) \quad \text{(range even)}.$$

Incidentally it is interesting to note that, although every summation
formula can be so expressed, there are an infinite number of
formulae of the expanded type which cannot be derived from
summation formulae. They may nevertheless be excellent for
graduation purposes and anyone interested in the subject is referred
to Dr Sheppard’s paper in J.I.A. Vol. XLVIII.

Summation formulae owe their importance to the ease with
which they can be applied, but with modern mechanical aids this
is not now as important as formerly.

The actual process of expansion can best be demonstrated by
examples. We shall first consider Spencer’s 15-term formula (not
to be confused with his 21-term formula given previously)

$$\frac{[4]^2[5]}{320}\{[1] + 6[3] - 3[5]\}u'_x. \qquad (9)$$

The operand can be expressed as

$$-3u'_{x+2} + 3u'_{x+1} + 4u'_x + 3u'_{x-1} - 3u'_{x-2}.$$

This is the most convenient form for our purpose. The method 
of detached coefficients is almost always used and care should be 
taken to insert zero coefficients for missing terms. In the first 
method of expansion demonstrated it is also assumed that there 
are zero coefficients at each end of the operand. The work is best 
set out in tabular form as follows: 

Hence the expanded formula is

$$\frac{1}{320}\{74u'_x + 67(u'_{x+1} + u'_{x-1}) + 46(u'_{x+2} + u'_{x-2}) + 21(u'_{x+3} + u'_{x-3}) + 3(u'_{x+4} + u'_{x-4}) - 5(u'_{x+5} + u'_{x-5}) - 6(u'_{x+6} + u'_{x-6}) - 3(u'_{x+7} + u'_{x-7})\},$$

where as before $u'_x$ represents the ungraduated value ($= u_x + \epsilon_x$ in
the previous notation).

It will be seen that in effect the formula has been applied to 1
with the important proviso that an unlimited number of noughts
could be assumed at either end. In applying the formula to observed
data it is not of course possible to make this assumption, so that
we finish with far fewer terms than we started. The assumption of
zero coefficients at each end in the above table has the effect of
apparently increasing the number of terms.

In actual practice it would not be necessary to obtain the final
column below the entry 74 or at any rate the second 67, since the
coefficients then repeat in reverse order. The previous columns
could have been abbreviated accordingly.

11. Alternative method. 

To illustrate another method of expansion we shall consider
Woolhouse’s formula

$$\frac{[5]^3}{125}(-3u_{-1} + 7u_0 - 3u_1). \qquad (10)$$

This formula is now chiefly of historic interest.

First we develop the operator as follows, writing coefficients only:

[5]    gives  1, 1, 1, 1, 1,
[5]^2  gives  1, 2, 3, 4, 5, 4, 3, 2, 1,
[5]^3  gives  1, 3, 6, 10, 15, 18, 19, 18, 15, 10, 6, 3, 1.

Each line is derived from the previous one by summing, very
much as in the previous example.

We now have to incorporate the operand, the coefficients of 
which are —3, 7, —3, as follows: 

The terms repeat in the reverse order after the coefficient 25 and
need not be written down.

The expanded formula is therefore

$$\frac{1}{125}\{-3u'_{x-7} - 2u'_{x-6} + 3u'_{x-4} + 7u'_{x-3} + 21u'_{x-2} + 24u'_{x-1} + 25u'_x + 24u'_{x+1} + \dots\}.$$
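
Both methods of expansion amount to repeated multiplication of detached coefficients, which can be sketched as follows. The helper names are illustrative; the operands and summations are those of Spencer's 15-term formula and of Woolhouse's formula as given above.

```python
def convolve(a, b):
    """Multiply two sets of detached coefficients."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def expand(operand, summations):
    """Apply each [n] in turn; the result still has to be divided by the divisor."""
    coeffs = list(operand)
    for n in summations:
        coeffs = convolve(coeffs, [1] * n)
    return coeffs


spencer15 = expand([-3, 3, 4, 3, -3], [4, 4, 5])   # numerators over 320
woolhouse = expand([-3, 7, -3], [5, 5, 5])         # numerators over 125

print(spencer15)
print(woolhouse)
```

The order of the operations is immaterial, so the operand may be brought in first (as for Spencer's formula in para. 10) or last (as for Woolhouse's formula here) with the same result.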

12. Analysis of an expanded formula.

It has been stated previously that formulae of the type

$$u_x = K_{-2}u'_{x-2} + K_{-1}u'_{x-1} + K_0u'_x + K_1u'_{x+1} + K_2u'_{x+2}$$

can be constructed so as to produce a satisfactory graduation 
though they cannot be arrived at by summations. This section 
applies to these formulae as well as to summation formulae. 

A great deal can be learnt merely from the inspection of the 
expanded formula without the calculation of any indices of 
smoothing power etc. 

The range, for instance, is found by considering the first and last 
terms, so that in the examples of paras. 10 and 11 the range is 15 in 
each case. 

13. The coefficient curve. 

An important conception in connection with an expanded formula 
is the curve of coefficients, or coefficient curve as it is sometimes 
called. This is obtained by plotting graphically the coefficients 
$K_{-r} \ldots K_0 \ldots K_r$ and joining them by a smooth curve of which the 
following are typical. 



The general characteristics are that the curve is symmetrical, 
rises to a peak in the middle, cuts the axis towards each end and 
thereafter lies below it. 

The sum of the coefficients must be unity, so that they will tend 
to be mainly positive proper fractions with a few negative ones. 

Remembering also that $K_r(u'_{x-r} + u'_{x+r})$ gives a second difference 
term of $r^2K_r\,\delta^2u'_x$, it follows that for zero second 
difference error $\sum r^2K_r = 0$. 

Hence some of the K's must be negative, and in order to counteract 
the predominating positive K's they must occur for the higher 
values of r; i.e. the negative coefficients will occur at the ends of 
the coefficient curve where they are weighted with the largest 
values of r. This is not a rigid demonstration; it is a discussion of 
the general form of the coefficient curve. Unusual formulae may 
prove exceptions. 
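
These two conditions are easily verified for a given set of coefficients. The sketch below, which is ours and merely illustrative, takes the expanded coefficients of Woolhouse's formula and checks that they sum to unity, that the sum of $r^2K_r$ taken from the centre outwards is zero, and that the negative coefficients lie at the ends.

```python
from fractions import Fraction

# Woolhouse's expanded coefficients K_{-7} ... K_7, numerators over 125
numerators = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]
K = {r: Fraction(c, 125) for r, c in zip(range(-7, 8), numerators)}

total = sum(K.values())                                  # should be unity
second_moment = sum(r * r * K[r] for r in range(1, 8))   # sum of r^2 K_r, centre outwards
negative_positions = [r for r in range(0, 8) if K[r] < 0]

print(total, second_moment, negative_positions)   # 1 0 [6, 7]
```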

From an examination of the run of the coefficients (the coefficient 
curve is not actually drawn in practice) it is possible to form an idea 
of how the formula will smooth the superimposed errors and also of 
its wave-cutting power. For clearness we shall deal with numerical 
examples; the reasoning is, however, quite general. 

Consider first Woolhouse's formula: 

$$\epsilon_x = \frac{1}{125}\{-3\epsilon'_{x-7} - 2\epsilon'_{x-6} + 3\epsilon'_{x-4} + 7\epsilon'_{x-3} + 21\epsilon'_{x-2} + 24\epsilon'_{x-1} + 25\epsilon'_x + 24\epsilon'_{x+1} \ldots\},$$

where $\epsilon'_x$ is the error superimposed on the true value, thus 
giving the observed value, and $\epsilon_x$ is the graduated error thrown up 
by the use of the formula, ignoring any distortion of the u's such 
as fourth difference errors. 

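Consider a particular observed value, say that at age 17, with its 
superimposed error $\epsilon'_{17}$. 

This error $\epsilon'_{17}$ will first appear on the extreme right of the formula 
giving the graduated error $\epsilon_{10}$. Subsequently it will appear in 
$\epsilon_{11}$, $\epsilon_{12}$, and so on, rising to maximum importance in $\epsilon_{17}$ and finally 
disappearing from the formula after $\epsilon_{24}$ has been calculated. 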

Similar remarks apply to the other errors, and it will be seen that 
any graduated error $\epsilon_x$ differs from the previous graduated error 
$\epsilon_{x-1}$ for the following reasons: 

(1) $\epsilon'_{x-8}$ has disappeared and $\epsilon'_{x+7}$ has appeared for the first time. 

(2) $\epsilon'_{x-7}$ to $\epsilon'_{x+6}$ inclusive now appear with different coefficients, 
having “moved up” one; thus every coefficient $K_r$ has been changed 
to $K_{r-1}$. 

(1) is only a special case of (2) if we imagine zero coefficients at 
each end. 
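
The passage of a single error through the formula can be traced numerically. The following sketch is illustrative only (the function name `graduate_errors` and the choice of 40 ages are our own assumptions): it places a unit error at age 17, all other errors being zero, and applies Woolhouse's expanded coefficients.

```python
# Woolhouse's coefficients K_{-7} ... K_7, as numerators over 125
numerators = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]

def graduate_errors(errors, numerators, divisor):
    """Graduated error at each age x for which all the ungraduated errors exist."""
    half = len(numerators) // 2
    out = {}
    for x in range(half, len(errors) - half):
        window = errors[x - half : x + half + 1]
        out[x] = sum(k * e for k, e in zip(numerators, window)) / divisor
    return out

errors = [0.0] * 40
errors[17] = 1.0                      # a single unit error at age 17
graduated = graduate_errors(errors, numerators, 125)
affected = [x for x, e in graduated.items() if e != 0.0]

print(min(affected), max(affected))   # 10 24
print(graduated[17])                  # 0.2, i.e. 25/125, the central coefficient
```

The graduated errors at ages 12 and 22 are zero in this instance only because the corresponding coefficients happen to be zero; the error is nevertheless within the range of the formula at those ages.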

It follows therefore that if the coefficient curve is smooth, i.e. if 
the successive coefficients K change only gradually, the graduated 
errors will themselves change only gradually however irregular 
the ungraduated errors $\epsilon'$ may be. Since the underlying “true” 
values are supposed to be smooth it follows that a formula which 
when expanded has a smooth run of coefficients will produce 
smoothly progressing graduated values. 

The coefficients -3, -2, 0, 3, 7, 21, 24, 25, 24, ... which we have 
been considering do not progress smoothly except at the centre, 
and it is not surprising therefore that Woolhouse’s formula is 
unsatisfactory from the point of view of smoothness. 

14. Wave-cutting. 

We can also learn something from the shape of the coefficient 
curve quite apart from its regular or irregular progression. 

In the diagrams on p. 190, the first curve rises steeply to a narrow 
peak, while the second rises very gradually to a broad flat top. The 
use of a formula represented by the first curve will mean that any 
particular ungraduated error will have a marked influence on the 
graduated values close to it but very little effect on the others. 
A formula typified by the second curve will spread the effect over 
a wide field. Any given $\epsilon'_x$ will have only moderate influence 
on $\epsilon_x$ and values near it, and appreciable effect on more distant 
values. 

Since the $\epsilon'$'s are random errors they will tend to change sign 
frequently, although sometimes a run of several consecutive errors 
of the same sign may arise. An ideal graduation would eliminate 
them completely; a summation formula gives them full weight 
although it spreads them over a larger range of values. A formula 
with a coefficient curve of the first type will tend to localize the effect 
of the errors and a wave in the ungraduated errors will be repeated 
although to a less extent in the graduated values. 

If, however, the coefficient curve is of the second type any 
graduated value depends to a larger extent on the more distant 
values and far less on the near ones. Consequently, a wave in the 
ungraduated errors will begin to have an appreciable effect on the 
graduated values much earlier than with the other type of formula 
and will continue to have an appreciable effect much later, while its 
maximum importance, corresponding to the peak of the coefficient 
curve, will be greatly reduced. Such a formula is therefore said to 
be a good wave-cutter. 

Woolhouse's formula has coefficients: 

$$\frac{1}{125}\{-3, -2, 0, 3, 7, 21, 24, 25, 24, \ldots\},$$

and because of the marked peak is a very poor wave-cutter. 

Hardy developed a special formula, given on p. 184, known 
as his “wave-cutting” formula, to distinguish it from his Friendly 
Society formula (p. 183, formula (5)). 

The coefficients of the expanded “wave-cutting” formula are: 

$$\frac{1}{65}\{-1, -2, -2, -1, 1, 4, 6, 7, 7, 6, 5, 5^*, 5, 6, \ldots\}.$$

The centre coefficient is marked with an asterisk and it will be 
noticed that the coefficient curve actually has a trough instead of 
the usual peak and the central eleven coefficients are all 5, 6 or 7. 
This formula is efficient therefore in dealing with long waves.* 

The action of a summation formula in this respect is similar to 
that of a roller on uneven ground; the local irregularities are almost 
entirely removed by flattening out ridges and filling up troughs with 
earth taken from those ridges, while more extensive mounds and 
hollows are reduced but not eliminated. 

15. Wave-cutting index. 

We now come to the calculation of certain well-known indices or 
“coefficients”, as they are usually called. The use of the word 
“coefficient” in this connection seems misleading and it is proposed 
therefore to use the word “index” in this book. 

The wave-cutting index is defined as the sum of the five central 
coefficients. This is somewhat arbitrary and breaks down if there 
is an even number of terms; in this case the sum of the four middle 
coefficients and the next one at either end is taken. 

The rationale is clear from the preceding section, for if the sum 
is large the coefficient curve is sharply peaked and the effect of the 
ungraduated errors is localized (unsatisfactory wave-cutter), while 
if the sum is small the curve is flat topped and the effect of the 
ungraduated errors is widely spread (good wave-cutter). 

The wave-cutting index of Woolhouse’s formula is $\frac{115}{125}$, or .92. 

* Vaughan has pointed out that this formula, when applied to short waves, 
may actually “reverse” them because of the high shoulders of the trough 
referred to in this paragraph. 

The wave-cutting index of Hardy’s “wave-cutting” formula is $\frac{27}{65}$, 
or about .42. 

As a general rule it may be said that a combination of two operators, 
one of short and the other of long range, will deal effectively with 
waves. 
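
The index is readily computed from the expanded coefficients. The following is a minimal sketch, covering only the case of an odd number of terms; the function name is ours, and the divisor 65 for Hardy's formula is inferred from the requirement that the coefficients sum to unity.

```python
from fractions import Fraction

def wave_cutting_index(numerators, divisor):
    """Sum of the five central coefficients of an odd-length expanded formula."""
    if len(numerators) % 2 == 0:
        raise ValueError("this sketch handles an odd number of terms only")
    mid = len(numerators) // 2
    return Fraction(sum(numerators[mid - 2 : mid + 3]), divisor)

woolhouse = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]
half = [-1, -2, -2, -1, 1, 4, 6, 7, 7, 6, 5]     # Hardy, up to but excluding the centre
hardy = half + [5] + half[::-1]                  # divisor 65 is an inference, see above

print(wave_cutting_index(woolhouse, 125))   # 23/25, i.e. 115/125
print(wave_cutting_index(hardy, 65))        # 27/65
```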


16. Error-reducing power. 

Clearly one of the most important functions of a summation 
formula is to reduce the superimposed errors $\epsilon'$. The expanded 
formula however does not itself give the required information. 

Consider, for instance, 

$$\epsilon_x = K_0\epsilon'_x + K_1(\epsilon'_{x-1} + \epsilon'_{x+1}) + \ldots + K_r(\epsilon'_{x-r} + \epsilon'_{x+r}).$$

We are not interested in a particular set of $\epsilon'$'s and the resulting $\epsilon$, 
but rather in the results obtained on the average if the formula were 
applied very many times in similar circumstances. In considering 
a large number of values of $\epsilon'_x$, $\epsilon'_{x+1}$, etc. we are faced with the 
difficulty that some will be positive and some negative. The only 
satisfactory way of overcoming this difficulty is to deal with the 
root-mean-square deviation or standard deviation of each superimposed 
error. 

We imagine a large number of observations made under similar 
conditions, thus producing a whole series of values of $\epsilon'$ at each age 
for which we can then calculate the standard deviations, taking each 
age separately. 

Denote the standard deviation of the various values of $\epsilon'_x$ by $S'_x$, 
the standard deviation of the various values of $\epsilon'_{x+1}$ by $S'_{x+1}$, and so on. 

By the application of the formula we can calculate the graduated 
errors $\epsilon_x$ and their standard deviation $S_x$ for each age. 

It remains to find a relationship between $S_x$ and the various $S'$'s. 

We have shown in Chapter III that if 

$$s = x + y + t + \ldots,$$

$$\sigma_s^2 = \sigma_x^2 + \sigma_y^2 + \sigma_t^2 + \ldots,$$

(if the variables x, y, t, etc. are independent, i.e. not correlated). 
By a simple extension it follows that if 

$$s = K_0x + K_1y + K_2t + \ldots,$$

$$\sigma_s^2 = K_0^2\sigma_x^2 + K_1^2\sigma_y^2 + K_2^2\sigma_t^2 + \ldots.$$

To show this we merely have to take $K_0x$ as a new variable X, 
$K_1y$ as a new variable Y, and so on, and as this is equivalent to 
altering the scale it follows that the standard deviation of X is 
$K_0\sigma_x$ and so on. The result then follows, and we deduce as a special 
case that 

$$S_x^2 = K_0^2S'^2_x + K_1^2(S'^2_{x-1} + S'^2_{x+1}) + K_2^2(S'^2_{x-2} + S'^2_{x+2}) + \ldots \qquad (11)$$

If each $S'$ is based on a very large number of $\epsilon'$'s the values involved 
in a single application of formula (11) should not differ 
greatly, and if we assume them all equal to $S'$ we find that 

$$S^2 = S'^2\{K_0^2 + 2K_1^2 + 2K_2^2 + \ldots\}. \qquad (12)$$

Hence the formula will on the average reduce the superimposed 
errors in the ratio 

$$\frac{S}{S'} = \sqrt{K_0^2 + 2K_1^2 + 2K_2^2 + \ldots}. \qquad (13)$$


The right-hand side is known as the error-reducing coefficient 
or index. 

In the more general type of formula (not derived by summation), 
in which coefficients equidistant from the centre are not equal, the 
same argument applies, but the error-reducing index is 

$$\{K_{-r}^2 + K_{-r+1}^2 + \ldots + K_{-1}^2 + K_0^2 + K_1^2 + \ldots + K_r^2\}^{1/2},$$

i.e. the square root of the sum of the squares of the coefficients. 
($K_{-r}$ is not necessarily equal to $K_r$.) 

The smaller the error-reducing index the more powerful the 
formula. 
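
The index can be computed directly from the expanded coefficients. The sketch below is ours and illustrative only; it is applied to the coefficients of Woolhouse's formula and of Hardy's “wave-cutting” formula quoted in para. 14, the divisor 65 for the latter being inferred from the requirement that the coefficients sum to unity.

```python
import math

def error_reducing_index(numerators, divisor):
    """Square root of the sum of the squares of all the expanded coefficients."""
    return math.sqrt(sum(c * c for c in numerators)) / divisor

woolhouse = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]
half = [-1, -2, -2, -1, 1, 4, 6, 7, 7, 6, 5]     # Hardy, up to but excluding the centre
hardy = half + [5] + half[::-1]

print(round(error_reducing_index(woolhouse, 125), 3))   # 0.423
print(round(error_reducing_index(hardy, 65), 3))        # 0.333
```

Since the whole list of coefficients is supplied, the central coefficient enters once and every other coefficient twice, as formula (13) requires.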

For Woolhouse’s formula: 

$$\text{Error-reducing index} = \frac{1}{125}\sqrt{2(3^2 + 2^2 + 3^2 + 7^2 + 21^2 + 24^2) + 25^2},$$

as the central coefficient does not occur twice, 

= .423. (Unsatisfactory.) 

For Hardy’s “wave-cutting” formula: 

$$\text{Error-reducing index} = \frac{1}{65}\sqrt{2(1^2 + 2^2 + 2^2 + 1^2 + 1^2 + 4^2 + 6^2 + 7^2 + 7^2 + 6^2 + 5^2) + 5^2}$$

= .333. 

(Quite satisfactory, considering the range of 23 terms.) 

17. Smoothing power. 

This may at first seem synonymous with error-reducing power, 
since errors which are brought closer to zero will tend to progress 
more smoothly. The error-reducing index depends only on the 
size of the coefficients, not on their order, and the formula operates 
by grouping together errors of opposite sign, which tend to cancel 
out. When we come to consider the smoothness of the graduated 
results, the order of the coefficients and the coefficient curve become 
of great importance. As we have seen, the ungraduated errors 
“move up one” each time the formula is applied to give successive 
graduated values; i.e. each ungraduated error is multiplied in turn 
by each of the K's. Provided that these progress smoothly, so will 
the graduated values, and this effect is independent of any reduction 
in the errors. 

The obvious way of testing smoothness is to consider the various 
orders of differences. It has become conventional to take the third 
order of differences in calculating a smoothing index. This choice is 
arbitrary, but, as will be seen later, it has one important practical 
advantage in that most well-known formulae have an operator 
consisting of three summations. 

As before, we are concerned not with a particular set of errors but 
with the result of a great many applications. We therefore consider 
not the $\epsilon'$'s but their standard deviations $S'$, which for convenience 
we shall again assume to be equal. 

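The general algebraical discussion is rather involved and we can 
best illustrate the argument by a numerical example, using Woolhouse’s 
formula 

$$\epsilon_x = \frac{1}{125}\{-3\epsilon'_{x-7} - 2\epsilon'_{x-6} + 3\epsilon'_{x-4} + 7\epsilon'_{x-3} + 21\epsilon'_{x-2} + 24\epsilon'_{x-1} + 25\epsilon'_x + \ldots\}.$$

To find $\Delta^3\epsilon$, the third difference of the right-hand side, we 
remember that $\Delta^3 = (E - 1)^3 = E^3 - 3E^2 + 3E - 1$. Writing coefficients 
only we arrange the work as shown in the table which follows: 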

The denominator 125 is introduced only in the last line, and 
although the whole formula has been developed for the purpose of 
illustration, it is not necessary to go beyond half-way, as coefficients 
then are repeated in reverse order but with changed signs. 

First nine terms: 

× 1          -3   -2    0    3    7   21   24   25   24
× -3               9    6    0   -9  -21  -63  -72  -75
× 3                    -9   -6    0    9   21   63   72
× -1                         3    2    0   -3   -7  -21

Δ³ε × 125    -3    7   -3    0    0    9  -21    9    0

Last nine terms: 

× 1          21    7    3    0   -2   -3
× -3        -72  -63  -21   -9    0    6    9
× 3          75   72   63   21    9    0   -6   -9
× -1        -24  -25  -24  -21   -7   -3    0    2    3

Δ³ε × 125     0   -9   21   -9    0    0    3   -7    3


If we assume a standard deviation of the errors at each age equal 
to $S'$, the standard deviation of the third difference error will be 

$$\frac{S'}{125}\sqrt{3^2 + 7^2 + 3^2 + 0^2 + \ldots + 3^2 + 7^2 + 3^2}$$

or, since each coefficient is repeated (there being no central term), 

$$= \frac{S'}{125}\sqrt{2(3^2 + 7^2 + 3^2 + \ldots + 9^2 + 0^2)},$$

the summation stopping at the middle. 

The argument is perfectly general; we first of all expand the 
formula in the usual way and then find the coefficients of $\Delta^3\epsilon_x$ 
expressed in terms of the ungraduated errors. 

The standard deviation of $\Delta^3\epsilon_x$ is then 

$$S'\sqrt{\text{sum of the squares of the coefficients}}.$$

Now $\Delta^3\epsilon'_x = \epsilon'_{x+3} - 3\epsilon'_{x+2} + 3\epsilon'_{x+1} - \epsilon'_x$. 

Hence, on the same assumption as before, the standard deviation 
of the third difference of the ungraduated errors is 

$$S'\sqrt{1^2 + 3^2 + 3^2 + 1^2} = S'\sqrt{20}.$$

The graduation has therefore reduced the standard deviation of 
the third difference of the errors in the ratio 

$$\sqrt{\frac{\text{sum of the squares of the coefficients}}{20}}, \qquad (14)$$

where the coefficients referred to are those in the expansion of $\Delta^3\epsilon$ 
in terms of the ungraduated $\epsilon'$’s. 

In the above example the smoothing index is 

$$\frac{1}{125}\sqrt{\frac{1340}{20}} = \frac{\sqrt{67}}{125} = .0655, \text{ or } \frac{1}{15} \text{ approx.}$$

Having regard to modern formulae this cannot be considered 
satisfactory; it should, however, be remembered that the range is 
only 15 terms and that the formula was one of the earliest ever 
constructed. 

Although the assumption made in this and the preceding section 
that the standard deviations of the errors are all equal is not likely 
to be realized in practice, the error-reducing index and the smoothing 
index will still provide useful relative measures for comparing two 
or more formulae. 
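
The calculation of the smoothing index from an expanded formula can be sketched as follows. This is an illustration only, with function names of our own choosing: the coefficients are differenced three times by multiplying by 1, -3, 3, -1, and the root of the sum of the squares is divided by the root of 20.

```python
import math

def third_difference_coefficients(numerators):
    """Coefficients of the third difference, in terms of the ungraduated errors."""
    out = [0] * (len(numerators) + 3)
    for i, c in enumerate(numerators):
        for j, d in enumerate([1, -3, 3, -1]):   # E^3 - 3E^2 + 3E - 1
            out[i + j] += c * d
    return out

def smoothing_index(numerators, divisor):
    d3 = third_difference_coefficients(numerators)
    return math.sqrt(sum(c * c for c in d3) / 20) / divisor

woolhouse = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]
print(third_difference_coefficients(woolhouse)[:9])
# [-3, 7, -3, 0, 0, 9, -21, 9, 0]
print(round(smoothing_index(woolhouse, 125), 4))   # 0.0655
```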


18. Second and fourth difference errors. 

When the formula has already been expanded, the second and 
fourth difference errors can conveniently be found by the following 
method. 

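From most of the well-known central difference formulae it 
follows that 

$$u'_{x-r} + u'_{x+r} = 2u'_x + r^2\delta^2u'_x + \frac{r^2(r^2-1)}{12}\delta^4u'_x + \ldots \qquad (15)$$

Hence 

$$K_0u'_x + K_1(u'_{x-1} + u'_{x+1}) + K_2(u'_{x-2} + u'_{x+2}) + \ldots + K_r(u'_{x-r} + u'_{x+r})$$

$$= u'_x\Big(K_0 + 2\sum K_r\Big) + \delta^2u'_x\sum r^2K_r + \delta^4u'_x\sum\frac{r^2(r^2-1)}{12}K_r + \ldots$$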

It should be remembered that we are now concerned with the 
underlying true values and not with the superimposed errors e', 
although the coefficients are the same. 

Owing to the construction of the formula $K_0 + 2\sum K_r = 1$. 

The second difference error is 

$$\delta^2u'_x\sum r^2K_r. \qquad (16)$$

The fourth difference error is 

$$\delta^4u'_x\sum\frac{r^2(r^2-1)}{12}K_r, \qquad (17)$$

where the summations extend from the central term of the expanded 
formula to either end and not over the entire range, so that each 
coefficient K occurs once only. 

If there is no second difference error we have $\sum r^2K_r = 0$ and the 
expression for the fourth difference error reduces to 

$$\delta^4u'_x\sum\frac{r^4}{12}K_r. \qquad (18)$$

For numerical work it will often be found that the more general 
expression (17) is preferable, as $r^2(r^2-1)$ is always divisible by 12. 
An example of the use of this method will be given later. 
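
In the meantime the two summations may be sketched numerically. The following is an illustration of ours, not the author's working: it applies expressions (16) and (17) to the expanded coefficients of Woolhouse's formula, taken from the centre outwards so that each coefficient occurs once only.

```python
from fractions import Fraction

def difference_errors(half_numerators, divisor):
    """half_numerators[r] = divisor * K_r for r = 0, 1, 2, ... (centre outwards).

    Returns the coefficients of the second and fourth differences."""
    second = sum(Fraction(r * r * c, divisor)
                 for r, c in enumerate(half_numerators))
    fourth = sum(Fraction(r * r * (r * r - 1) * c, 12 * divisor)
                 for r, c in enumerate(half_numerators))
    return second, fourth

woolhouse_half = [25, 24, 21, 7, 3, 0, -2, -3]    # K_0 ... K_7, numerators over 125
second, fourth = difference_errors(woolhouse_half, 125)
print(second, fourth)   # 0 -27/5
```

On these figures Woolhouse's formula has no second difference error, and the coefficient of the fourth difference is -5.4.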


19. Alternative method of finding the smoothing index. 

If only the second and fourth difference errors and the smoothing 
index are required it is unnecessary to expand the formula. 

The two errors can be found by the method described in paras. 
5, 6 and 7, while the smoothing index can be found by the following 
very elegant method which also throws considerable light on the 
construction of summation formulae. The method is due to G. J. 
Lidstone, who described it in two papers to be found in J.I.A. 
Vols. XLI, pp. 348 et seq. and XLII, pp. 106 et seq. Both these 
should be read as classic examples of actuarial literature and 
because of the masterly analysis of summation formulae which 
they contain. 

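$$[n] = E^{-\frac{n-1}{2}} + E^{-\frac{n-3}{2}} + \ldots + E^{\frac{n-3}{2}} + E^{\frac{n-1}{2}}$$

$$= E^{-\frac{n-1}{2}}\{1 + E + E^2 + \ldots + E^{n-1}\} = \frac{E^n - 1}{E^{\frac{n-1}{2}}(E - 1)} = \frac{E^n - 1}{E^{\frac{n-1}{2}}\Delta}. \qquad (19)$$

Hence a typical formula involving three summations in the 
operator, say, 

$$\frac{[l][m][n]}{lmn}\{\text{operand}\},$$

may be written 

$$\frac{(E^l - 1)(E^m - 1)(E^n - 1)}{lmn\,\Delta^3E^{\frac{l+m+n-3}{2}}}\{\text{operand}\}.$$

To find the smoothing index we find the third difference; i.e. we 
operate on the formula with $\Delta^3$, giving 

$$\frac{(E^l - 1)(E^m - 1)(E^n - 1)}{lmn\,E^{\frac{l+m+n-3}{2}}}\{\text{operand}\}. \qquad (20)$$

As the operator in the denominator affects merely the suffix and 
not the coefficient of each term it can be ignored. We are left with 

$$\frac{(E^l - 1)(E^m - 1)(E^n - 1)}{lmn}\{\text{operand}\}. \qquad (21)$$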

It should be pointed out in passing that the practice of taking the 
third order of differences means that an operator involving three 
summations can be dealt with very easily. If only two summations 
are involved (e.g. Hardy’s “wave-cutting” formula) it is easier to 
find the coefficients of $\Delta^2\epsilon$ by the method and then difference the 
result so as to give the coefficients of $\Delta^3\epsilon$. 

From the above expression (21) it is a simple matter to evaluate 
the necessary coefficients of $\Delta^3\epsilon$ in terms of the $\epsilon'$’s. It should be 
borne in mind that in the second half of the expansion the coefficients 
are repeated in reverse order but with changed signs. 
If the range of the formula is R the number of terms in $\Delta^3\epsilon$ is 
R + 3 (usually an even number), so that only the first $\frac{R+3}{2}$ need 
be evaluated if R is odd, or the first $\frac{R+4}{2}$ if R is even. 

The following example illustrates the method: 


Example 3. 

Find the second and fourth difference errors and the smoothing index 
of Spencer’s 21-term formula 

$$\frac{[5]^2[7]}{350}\{[1] + [3] + [5] - [7]\}u_x.$$

All the required information can be obtained without expanding the 
formula. 

It was shown on p. 185 that this formula has no second difference error 
but a fourth difference error of $-12.6\,\delta^4u_x$. 

To find the smoothing index we proceed as follows: 

$$\Delta^3\epsilon = \frac{(E^5 - 1)^2(E^7 - 1)}{350\,E^7}\{\text{operand}\}.$$

Writing only coefficients of the operand so that $E^7$ in the denominator 
can be ignored we obtain: 

$$\frac{1}{350}(E^{17} - 2E^{12} - E^{10} + E^7 + 2E^5 - 1)(-1, 0, 1, 2, 1, 0, -1).$$

The expansion can be carried out as follows, evaluating only the first 
12 terms: 

E^17      -1   0   1   2   1   0  -1
-2E^12                         2   0  -2  -4  -2   0   2 ...
-E^10                                  1   0  -1  -2  -1 ...
E^7                                               -1   0 ...

Total     -1   0   1   2   1   2  -1  -1  -4  -3  -3   1 ...

The term in $E^5$ and the constant do not affect the first 12 terms. The 
sum of the squares of the coefficients 

$$= \frac{2}{350^2}\{6 \times 1^2 + 2 \times 2^2 + 2 \times 3^2 + 4^2\} = \frac{96}{350^2}.$$

Hence the smoothing index $= \frac{1}{350}\sqrt{\frac{96}{20}} = .00626$ or $\frac{1}{160}$ approx. 
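
Lidstone's method lends itself to a short computation. The sketch below is ours and illustrative only: `shift_poly` (a name of our own) multiplies out the factors $(E^l - 1)(E^m - 1)\ldots$, and the product is then combined with the coefficients of the operand as in expression (21), the denominator being introduced at the end.

```python
import math

def shift_poly(summations):
    """Coefficients of (E^l - 1)(E^m - 1)... as {power of E: coefficient}."""
    poly = {0: 1}
    for n in summations:
        nxt = {}
        for p, c in poly.items():
            nxt[p + n] = nxt.get(p + n, 0) + c   # the E^n part
            nxt[p] = nxt.get(p, 0) - c           # the -1 part
        poly = nxt
    return {p: c for p, c in poly.items() if c != 0}

def third_difference_coefficients(summations, operand):
    poly = shift_poly(summations)
    top = max(poly)
    out = [0] * (top + len(operand))
    for p, c in poly.items():
        for j, k in enumerate(operand):
            out[top - p + j] += c * k            # highest power of E is written first
    return out

def smoothing_index(summations, operand, divisor):
    d3 = third_difference_coefficients(summations, operand)
    return math.sqrt(sum(c * c for c in d3) / 20) / divisor

operand = [-1, 0, 1, 2, 1, 0, -1]
print(third_difference_coefficients([5, 5, 7], operand)[:12])
# [-1, 0, 1, 2, 1, 2, -1, -1, -4, -3, -3, 1]
print(round(smoothing_index([5, 5, 7], operand, 350), 5))   # 0.00626
```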


Example 4. 

Analyse fully Spencer’s 21-term formula. 

For a full analysis it is necessary to expand the formula. This can be 
done as follows, remembering that only the first eleven coefficients are 
needed: 

$[5]$ gives 1, 1, 1, 1, 1, 

$[5]^2$ gives 1, 2, 3, 4, 5, 4, 3, 2, 1, 

$[5]^2[7]$ gives 1, 3, 6, 10, 15, 19, 22, 23, 22, 19, 15, .... 

Since the coefficients of the operand are -1, 0, 1, 2, 1, 0, -1, 
we have 

× -1       -1   -3   -6  -10  -15  -19  -22  -23  -22  -19  -15 ...
× 1                   1    3    6   10   15   19   22   23   22 ...
× 2                        2    6   12   20   30   38   44   46 ...
× 1                             1    3    6   10   15   19   22 ...
× -1                                     -1   -3   -6  -10  -15 ...

Total      -1   -3   -5   -5   -2    6   18   33   47   57   60 ...

As a check on the accuracy of the work we have: 

sum of the last line = 2(-1 - 3 - 5 - ... + 47 + 57) + 60 = 350, 

so that the sum of the coefficients is unity, as it should be. 

Note: although the denominator of 350 in the formula is left out of 
account in much of the numerical work it should never be overlooked. 

Range. The range is 21 terms. 

Coefficient curve. The coefficients progress quite smoothly and the 
peak of the curve is fairly broad. We should therefore expect good 
smoothing power and fair wave-cutting properties, in spite of the fact 
that the range of the operators is 5, 5 and 7, while for good wave-cutting 
the ranges should differ widely. 

The second and fourth difference errors and the error-reducing index 
can conveniently be calculated at the same time as follows, where $K_r$ has 
its usual meaning as a coefficient in the expanded formula. 

   r      350K_r    r² × 350K_r    r²(r²-1)/12 × 350K_r    (350K_r)²
  (1)       (2)         (3)                 (4)                (5)

   10        -1        -100                -825                  1
    9        -3        -243               -1620                  9
    8        -5        -320               -1680                 25
    7        -5        -245                -980                 25
    6        -2         -72                -210                  4
    5         6         150                 300                 36
    4        18         288                 360                324
    3        33         297                 198               1089
    2        47         188                  47               2209
    1        57          57                   0               3249
    0        60           0                   0               3600

 Total             -980 + 980         -5315 + 905        3600 + 6971
                      = 0               = -4410

Column (4) was derived from (3) by multiplying by $\frac{r^2-1}{12}$, as this 
reduces the numbers involved. Since $\sum r^2K_r = 0$ we could however have 
calculated $\sum\frac{r^4}{12}K_r$ to arrive at the fourth difference error (see formula 18). 

Second difference error. This is equal to $\sum r^2K_r\,\delta^2u = 0$. 

Fourth difference error. This is given by $\sum\frac{r^2(r^2-1)}{12}K_r$ or by $\sum\frac{r^4}{12}K_r$ 
if $\sum r^2K_r$ is zero. 

From column (4) the fourth difference error 

$$= -\frac{4410}{350}\delta^4u = -12.6\,\delta^4u \text{ as before.}$$

It should be remembered that in finding the second and fourth 
difference errors by this method, the summations extend over only half 
the whole range, i.e. from the centre to either end. This does not apply 
to any other index. 

Error-reducing index. $\sum K_r^2$ over the whole range $= \frac{2 \times 6971 + 3600}{(350)^2}$. 

Note: the total of the last column is shown in two parts, since the 
square of the central coefficient (r = 0) does not need to be doubled. 

Error-reducing index $= \frac{1}{350}\sqrt{17542} = .378$. 

Wave-cutting index. The sum of the five central coefficients is $\frac{268}{350}$, or 
about .766, indicating only very moderate wave-cutting power. (Cf. .42 
for Hardy’s “wave-cutting” formula.) 

Smoothing index. Since $\Delta^3 = E^3 - 3E^2 + 3E - 1$, we find the coefficients 
of $\Delta^3\epsilon$ as follows, writing only the first twelve values and ignoring the 
denominator of 350. 

E^3        -1   -3   -5   -5   -2    6   18   33   47   57   60   57 ...
-3E^2            3    9   15   15    6  -18  -54  -99 -141 -171 -180 ...
3E                   -3   -9  -15  -15   -6   18   54   99  141  171 ...
-1                         1    3    5    5    2   -6  -18  -33  -47 ...

Total      -1    0    1    2    1    2   -1   -1   -4   -3   -3    1 ...

Hence the sum of the squares of the coefficients of $\Delta^3\epsilon$ $= \frac{96}{(350)^2}$. 

The smoothing index is therefore as before 

= .00626 or $\frac{1}{160}$. 
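
The whole of the foregoing analysis can be gathered into one short routine. The following is a sketch under our own naming (`convolve`, `analyse`), assuming a symmetrical formula with an odd number of terms; it is offered as a check on the arithmetic and not as the author's method of working.

```python
import math
from fractions import Fraction

def convolve(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def analyse(summations, operand, divisor):
    coeffs = list(operand)
    for n in summations:
        coeffs = convolve(coeffs, [1] * n)      # expand the formula
    mid = len(coeffs) // 2                      # assumes an odd number of terms
    d3 = convolve(coeffs, [1, -3, 3, -1])       # coefficients of the third difference
    rs = range(1, mid + 1)                      # centre outwards, each K_r once only
    return {
        "range": len(coeffs),
        "wave_cutting": Fraction(sum(coeffs[mid - 2 : mid + 3]), divisor),
        "error_reducing": math.sqrt(sum(c * c for c in coeffs)) / divisor,
        "smoothing": math.sqrt(sum(c * c for c in d3) / 20) / divisor,
        "second_difference": sum(Fraction(r * r * coeffs[mid + r], divisor) for r in rs),
        "fourth_difference": sum(
            Fraction(r * r * (r * r - 1) * coeffs[mid + r], 12 * divisor) for r in rs),
    }

report = analyse([5, 5, 7], [-1, 0, 1, 2, 1, 0, -1], 350)
for name, value in report.items():
    print(name, value)
```

For Spencer's 21-term formula this gives a range of 21, a wave-cutting index of 268/350, an error-reducing index of about .378, a smoothing index of about .00626, no second difference error and a fourth difference coefficient of -12.6.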


20. Choice of the operators [l], [m], [n]. 

The smoothing index is 

$$\sqrt{\text{sum of squares of coefficients}}\,/\sqrt{20}.$$

Hence for the formula 

$$\frac{[l][m][n]}{lmn}\{\text{operand}\}$$

the smoothing index will have the product lmn in the denominator. 
For a given operand and a given range the sum l + m + n is fixed, since 

range = range of operand + l + m + n - 3 (see para. 8). 

We can increase the product lmn by making the factors more 
nearly equal, and it might at first seem that the most efficient formula 
would be obtained by making 

l = m = n. 

Although this tends to improve the error-reducing power it tends 
to impair the smoothing power and the same applies if two only of 
the values l, m, n are made equal. 

To prove this we consider expression (21), which can be used 
in finding the smoothing index. The smaller the coefficients found 
by expanding this expression the better the smoothing power, 
especially when it is remembered that the coefficients have to be 
squared in finding the index. 

Now the operator 

$$(E^l - 1)(E^m - 1)(E^n - 1) = E^{l+m+n} - E^{l+m} - E^{m+n} - E^{n+l} + E^l + E^m + E^n - 1$$

will have unit coefficients if l, m and n are unequal. If, however, 
two of them (say m and n) are equal it reduces to 

$$E^{l+2m} - E^{2m} - 2E^{l+m} + 2E^m + E^l - 1,$$

which tends to produce greater coefficients when combined with 
the operand. 

If l = m = n it becomes $E^{3l} - 3E^{2l} + 3E^l - 1$, so that the coefficients 
in the expanded formula will be relatively large because of the 
coefficients 3 which are subsequently squared. It follows therefore 
that l, m and n should be nearly, but not exactly, equal. [4] [5] [6] as 
used in Hardy’s Friendly Society formula is therefore a good operator. 

Vaughan has pointed out that the result is further improved if 
l + m = n, since then the terms $-E^{l+m}$ and $E^n$ cancel, leaving only 
six terms in the product. 
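
The effect of the choice of l, m and n on the product can be seen by multiplying out the factors. The sketch below is ours; the function name and the particular operators tried are illustrative, [3] [4] [7] being simply one instance of l + m = n. The sum of the squares of the coefficients of the product is printed only as a rough guide, since the operand has still to be combined with it.

```python
def difference_operator(summations):
    """Expand (E^l - 1)(E^m - 1)... and return {power of E: coefficient}."""
    poly = {0: 1}
    for n in summations:
        nxt = {}
        for power, c in poly.items():
            nxt[power + n] = nxt.get(power + n, 0) + c   # the E^n part
            nxt[power] = nxt.get(power, 0) - c           # the -1 part
        poly = nxt
    return {p: c for p, c in sorted(poly.items(), reverse=True) if c != 0}

for ops in ([4, 5, 6], [5, 5, 7], [5, 5, 5], [3, 4, 7]):
    terms = difference_operator(ops)
    print(ops, terms, sum(c * c for c in terms.values()))
```

With unequal operators the eight terms have unit coefficients (sum of squares 8); with two equal the coefficients 2 appear (12); with all three equal the coefficients 3 appear (20); and with l + m = n two terms cancel, leaving six (sum of squares 6).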

He has devised some very interesting formulae on these lines and 
the reader is strongly advised to consult two papers which he read 
to the Institute (reproduced in J.I.A. Vols. LXIV, p. 428 and 
LXVI, p. 463). 

One objection to formulae in which l + m = n is that either one or 
all of the summations must be even; the effect of this is discussed 
in the next section. 

A further point clearly brought out by Mr Vaughan’s work is that 
a good operator and a good operand do not necessarily combine to 
produce a good formula; it is in fact impossible to predict how 
they will blend. There is no really satisfactory method of judging 
the efficacy of a formula except by analysing it fully as explained in 
previous sections. As regards its application to a particular experience 
the best test is the examination of the results that it produces. 


21. Even summations. 

As previously explained the result of a summation of an even 
number of terms is to produce a result lying midway between the 
two central ones. A second even summation brings the resulting 
values back into alignment, and hence a formula involving, say, the 
operators [4] [5] [6] produces values for integral arguments. If, for 
instance, we operate on crude values for integral ages the 
graduated values will also relate to integral ages. Occasionally, 
however, it may be an advantage to have one or three even summations; 
for example we can then make l + m = n, as mentioned 
above, and if we are operating on values for “ages last birthday” 
the formula will automatically produce values for exact integral 
ages, provided that it can be assumed that on the average 

exact age = age last birthday + ½. 

One important disadvantage of even summations in general is 
that the coefficients tend to be more complicated and it is therefore 
difficult to eliminate a second difference error. 

It will be remembered that the second difference error in $[n]u_x$ is 
$\frac{n(n^2-1)}{24}\delta^2u_x$. Now if n is odd, n - 1 and n + 1 are both even and one 
is a multiple of 4. Moreover, one of the consecutive numbers n - 1, 
n, n + 1 must be divisible by 3, so that the expression $\frac{n(n^2-1)}{24}$ is an 
integer. If n is even the numerator may only be divisible by 6, so 
that awkward fractions arise. In order to produce a convenient 
working formula it may be necessary to introduce a small second 
difference error. For instance, using the operators [4] [5] [6], Hardy 
produced the formula 


which has a second difference error of or . 
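
The divisibility argument is easily tested. In the sketch below, which is ours and illustrative only, `summation_second_difference` gives the coefficient $\frac{n(n^2-1)}{24}$ for a single summation, and `operator_second_difference` adds the contributions $\frac{n^2-1}{24}$ of the several operators after division by l, m, n, products of second differences being neglected.

```python
from fractions import Fraction

def summation_second_difference(n):
    """Coefficient of the second difference in [n]u_x, i.e. n(n^2 - 1)/24."""
    return Fraction(n * (n * n - 1), 24)

def operator_second_difference(summations):
    """Second difference coefficient of [l][m][n].../(l m n ...), to first order."""
    return sum(summation_second_difference(n) / n for n in summations)

print(all(summation_second_difference(n).denominator == 1 for n in range(1, 50, 2)))  # True
print([str(summation_second_difference(n)) for n in (2, 4, 6)])   # ['1/4', '5/2', '35/4']
print(operator_second_difference([4, 5, 6]))                      # 37/12
print(operator_second_difference([5, 5, 5]))                      # 3
```

The fraction 37/12 for the operators [4] [5] [6] shows why an operand with simple integral coefficients cannot remove the second difference error exactly.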

Actually a small positive second difference error may be an advantage, 
because the fourth difference error is always negative and the 
two tend to offset each other. Spencer in fact devised a formula for 
which the second and fourth difference errors would cancel in this 
way if the function operated on were of the form $A + Hx + Bc^x$. 

It is impossible to ensure that this will occur in general; much 
depends on the particular function under investigation. 

If we consider the coefficient curve we can see easily why the 
fourth difference error is always negative. The negative values occur 
at each end where the values of r (in our previous notation) are 
greatest (see para. 13). 

This ensures that $\sum r^2K_r$ shall approximate to zero, thus producing 
little or no second difference error, but making $\sum\frac{r^2(r^2-1)}{12}K_r$ 
negative, since weighting with $r^4$ gives the negative values of $K_r$ 
great importance. 


22. Maximum smoothing power and maximum error-reducing 
power are mutually exclusive. 

It is obvious that any formula which reduces the errors effectively 
will automatically produce smooth results. If a formula is devised 
so as to have the maximum error-reducing power for its range it will 
always be possible to construct a formula of the same range with 
greater smoothing power. The first concentrates on eliminating 
the errors; the second on smoothing them. 

Consider, for instance, the formula 

$$\frac{[5]^3}{125}\{xu_0 + y(u_{-1} + u_1) + z(u_{-2} + u_2)\},$$

where x, y and z are to be determined so as to produce no second 
difference error and the minimum smoothing index. 

The coefficients of $\Delta^3\epsilon$ are given by $\frac{(E^5 - 1)^3}{125}\{z + y + x + y + z\}$. 

(Incidentally, the weakness of operators of equal range has been 
mentioned previously and the operator $[5]^3$ is unlikely to give good 
results.) 

Writing down only the first ten coefficients and ignoring the 
denominators we obtain the following: 

$E^{15}$ gives z + y + x + y + z, 

$-3E^{10}$ gives -3z - 3y - 3x - 3y - 3z. 

The term in $E^5$ and the constant do not affect the first ten 
terms and the sum of the squares of the coefficients of $\Delta^3\epsilon$ is 

$$\frac{2}{(125)^2}\{20z^2 + 20y^2 + 10x^2\},$$

giving a smoothing index of $\frac{1}{125}\sqrt{x^2 + 2y^2 + 2z^2}$. 

Hence we have to make $x^2 + 2y^2 + 2z^2$ a minimum. 
Differentiating with respect to x, we obtain 

$$x + 2y\frac{dy}{dx} + 2z\frac{dz}{dx} = 0. \qquad (22)$$

Also, since the formula must reduce to $u_0$ together with terms in 
fourth and higher differences if it 
has no second difference error, we must have 

$$x + 2y + 2z = 1 \qquad (23)$$

and 

$$y + 4z = -3. \qquad (24)$$

This last equation arises from the fact that the second difference 
term in the operand is 

$$(y + 4z)\,\delta^2u_0,$$

and this has to neutralize the second difference error of $3\delta^2$ in the 
operator. 

Differentiating the last two equations with respect to x, we have 

$$1 + 2\frac{dy}{dx} + 2\frac{dz}{dx} = 0,$$

$$\frac{dy}{dx} + 4\frac{dz}{dx} = 0.$$

Eliminating $\frac{dy}{dx}$ and $\frac{dz}{dx}$ from these two equations and equation (22) 
we obtain 

$$3x - 4y + z = 0. \qquad (25)$$

Equations (23), (24) and (25) can now be solved giving: 

$$x = \frac{47}{35}, \quad y = \frac{27}{35}, \quad z = -\frac{33}{35}.$$

The formula is therefore 

$$\frac{[5]^3}{4375}\{47u_0 + 27(u_{-1} + u_1) - 33(u_{-2} + u_2)\}.$$


In actual practice x would probably be taken as f, y as f and z as 
— I, the theoretical result being too cumbersome. 

For maximum error-reducing power the formula would have 
been expanded, and instead of $x^2 + 2y^2 + 2z^2$ the sum of the squares 
of the coefficients would have been made a minimum. 

The resulting formula would have been different, thus illustrating 
the fact that maximum error-reducing power is incompatible 
with maximum smoothing power. 
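
The solution of equations (23), (24) and (25) can be confirmed by exact arithmetic. The following sketch uses a small elimination routine of our own (`solve`), written only for this illustration, and compares the quantity $x^2 + 2y^2 + 2z^2$ with its value for Woolhouse's operand (x = 7, y = -3, z = 0), which satisfies (23) and (24) but not (25).

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Gauss-Jordan elimination with exact fractions (square, non-singular system)."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# (23) x + 2y + 2z = 1,  (24) y + 4z = -3,  (25) 3x - 4y + z = 0
x, y, z = solve([[1, 2, 2], [0, 1, 4], [3, -4, 1]], [1, -3, 0])
print(x, y, z)                               # 47/35 27/35 -33/35

optimum = x * x + 2 * y * y + 2 * z * z      # 167/35, about 4.77
woolhouse = 7 * 7 + 2 * 3 * 3 + 2 * 0 * 0    # 67
print(optimum, woolhouse)
```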


23. Recent practical developments. 

The practical aspect of these formulae as distinct from the 
theoretical aspect had been very largely ignored until G. J. Lidstone 
and D. C. Fraser contributed two interesting notes to J.I.A. 
Vol. LXVII giving some neat labour-saving devices. 

The reader is referred to the original articles for details, but in 
the numerical example which follows use has been made of an 
artifice suggested by Fraser. If the formula does not involve any 
second difference error then the values obtained by graduating 

where b and c are constants, will be equal to the values obtained 
by graduating and afterwards subtracting {a-\‘bx+cx^). This 
means that the numerical values of u^, can be reduced appreciably 
by a suitable choice of the function a^bx-\-cx^\ after the re¬ 
mainders have been graduated we merely have to add back the 
values previously deducted to produce the required graduation. 
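Fraser's artifice can be checked numerically. The sketch below is illustrative only: it assumes Spencer's 21-term formula (which has no second difference error), quoted through its usual expanded weights, and both the rough series `u` and the deducted quadratic `quad` are invented for the demonstration.

```python
from fractions import Fraction

# Spencer's 21-term weights, central weight first, each over 350.
# These are the usual expanded coefficients, quoted here as an assumption.
HALF = [60, 57, 47, 33, 18, 6, -2, -5, -5, -3, -1]
WEIGHTS = [Fraction(w, 350) for w in HALF[:0:-1] + HALF]

def graduate(values):
    """Apply the formula wherever a full span of 21 terms exists."""
    r = len(WEIGHTS) // 2
    return [sum(w * values[i + k - r] for k, w in enumerate(WEIGHTS))
            for i in range(r, len(values) - r)]

# Invented rough series: a smooth trend plus small irregular errors.
xs = list(range(30))
u = [200 + 15 * x + 2 * x * x + (7 * x * x) % 11 - 5 for x in xs]

# Invented quadratic a + bx + cx^2 deducted before graduating.
quad = [180 + 14 * x + 2 * x * x for x in xs]
remainders = [ui - qi for ui, qi in zip(u, quad)]

r = len(WEIGHTS) // 2
direct = graduate(u)
via_remainders = [g + q for g, q in
                  zip(graduate(remainders), quad[r:len(xs) - r])]

print(direct == via_remainders)
```

The remainders are much smaller numbers than the original series, which is the whole point of the device when the work is done by hand.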

If the formula involves a second difference error $a + bx$ can be
deducted, but a term $cx^2$ will itself contribute a second difference
error.


24. Graduation of the A1924-29 Table, Ultimate Rates.

The rates for durations 3 and over, “All Classes Combined”, in
the A1924-29 experience were given by the data for half-ages
20½, 21½, etc., and values for integral ages might have been obtained
by the use of a summation formula involving one or three even 
summations. Actually, in order to produce results quickly, it was 
decided to use Spencer’s 21-term formula, as this is one of the best 
formulae for general purposes. Values for integral ages were deduced 
by visual interpolation for most of the table, but where second and 
higher differences were appreciable a simple finite difference formula 
was used at higher ages. 

The ends of the table were completed by third difference extrapolation.
It is obvious that any summation formula of range $R$ will
leave $\tfrac{1}{2}(R - 1)$ terms at each end to be filled in by other methods.


Sometimes Makeham’s or Gompertz’s law is assumed, particularly
at high ages, but the precise method to be adopted depends on
the run of the data, and for the A 1924-29 rates a finite difference 
method was found to be satisfactory. In this connection it should 
be pointed out that the report states that the graduation was “made 
to show a more distinct increase age by age in the rates of mortality 
above age 85 than that shown by the statistics”. 

As a numerical example we shall calculate the graduated rates
(duration 3 and over) for ages 30½ to 64½, using the rough values
of $q_x$ for ages 20½ to 74½ for the combined data for the years 1927-9.

The ungraduated values of $10^5 \times q_x$ are shown in the first column.
Spencer’s 21-term formula is

$$\frac{[5]^2[7]}{350}\{[1] + [3] + [5] - [7]\}u_x = \frac{[5]^2[7]}{350}\{2u_x + (u_{x-1} + u_{x+1}) - (u_{x-3} + u_{x+3})\}.$$
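The two forms of the operand, and the 21 weights that result when the summations are carried out, can be confirmed mechanically. The sketch below represents each operator as a mapping from offset to coefficient; the helper names `bracket`, `add` and `multiply` are invented for this illustration.

```python
def bracket(n):
    """Coefficients of the summation operator [n] (n odd), keyed by offset."""
    h = n // 2
    return {k: 1 for k in range(-h, h + 1)}

def add(p, q, sign=1):
    """Return p + sign * q, term by term."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + sign * v
    return out

def multiply(p, q):
    """Compose two operators (polynomial multiplication of coefficients)."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out

# Operand {[1] + [3] + [5] - [7]}.
operand = add(add(add(bracket(1), bracket(3)), bracket(5)), bracket(7), sign=-1)

# Numerators of the full operator [5]^2 [7] {operand}; divide by 350 for weights.
numerators = multiply(
    multiply(multiply(bracket(5), bracket(5)), bracket(7)), operand)
spencer = [numerators[k] for k in range(-10, 11)]

# Terms left at each end of the table to be completed by other methods.
terms_lost_each_end = (len(spencer) - 1) // 2

print(operand)
print(spencer, sum(spencer), terms_lost_each_end)
```

The operand collapses to $2u_x + (u_{x-1} + u_{x+1}) - (u_{x-3} + u_{x+3})$, and the range of 21 terms means that ten values at each end of the series cannot be graduated by the formula itself.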

Example 5. Continuous Experience 1927-9. Durations 3 and over. All classes combined.

* It was difficult to find a satisfactory second-degree function to deduct from $10^5 \times q_x$ to reduce the numbers involved in
the graduation. Eventually it was decided to deduct the linear function $110 + 10(x - 20\tfrac{1}{2})$. To save space this function is not shown.

** The same values of the function are added to the graduated values given in column (11), giving column (12), from which
the rates for integral ages could be interpolated.



25. The A1924-29 Table, Select rates.

It is convenient to deal here with the select portion of the 
A1924-29 table although the rates were not graduated by a 
summation formula. 

The select rates for durations 0, 1 and 2 were expressed as percentages
of the ultimate rates for the same attained age.

Thus $q_{[47]+2}$ was expressed as a percentage of $q_{49}$, ultimate. After an
investigation of the actual percentages derived from the crude
select rates it was found possible at duration 0 to assume a percentage
of 61 at age 45, increasing by .4 for each year of age
under 45 to a maximum of 68.2 for ages 27 and under and decreasing
by .3 for each year of age over 45.
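The rule for duration 0 is a simple piecewise-linear function of age and can be written down directly. The function below is a minimal sketch of the rule as described; the name `duration0_percentage` is invented here, and the percentage is worked in tenths of one per cent so that the results are exact.

```python
def duration0_percentage(age):
    """Assumed select rate at duration 0 as a percentage of the ultimate rate.

    61 at age 45, rising by .4 per year of age under 45 to a maximum of
    68.2 (reached at age 27), and falling by .3 per year of age over 45.
    """
    if age <= 45:
        tenths = min(610 + 4 * (45 - age), 682)
    else:
        tenths = 610 - 3 * (age - 45)
    return tenths / 10

for age in (20, 27, 44, 45, 50):
    print(age, duration0_percentage(age))
```

The cap at 68.2 is reached exactly at age 27, since 18 years at .4 a year add 7.2 to the percentage of 61 at age 45.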

For duration 1 it was assumed for all ages that

$$q_{[x]+1} = \tfrac{1}{2}\left(q_{[x+1]} + q_{x+1}\right). \qquad (26)$$

For duration 2 it was assumed for all ages that 

?[a !)+2 = ’^x+i + .(27) 


It will be noticed that, in each of these equations, the q’s relate
to the same attained age. 

26. The test applied to a graduation by a summation formula. 

It is impossible by any analytical method to find what constraints
are imposed by a summation formula, but in Seal’s paper
referred to earlier an experiment was carried out by the author with 
Spencer’s 21-term formula. As a result he decided to assume that 
about five degrees of freedom were lost. The original paper should 
be referred to for details of the method. The assumption made later 
that Kenchington’s formula results in the loss of six degrees of 
freedom is rather controversial and should be accepted with some 
reserve. 

27. Advantages of the summation method of graduation. 

Once a suitable formula has been chosen or constructed the
process is purely mechanical and does not require a highly skilled 
operator as does the graphic method. 

The method is suitable for standard tables based on large experiences
and can be relied upon to give adequate smoothness,
provided that the unadjusted rates themselves progress fairly
smoothly. An advantage of the method is that results are produced
quickly.

Perhaps its greatest merit, possessed by no other method, arises
in connection with functions such as sickness rates.

Suppose that

$$f(x) = a_1\phi_1(x) + a_2\phi_2(x) + a_3\phi_3(x) + \dots + a_r\phi_r(x),$$

where the a’s are constants and x is variable.

If the functions $f, \phi_1, \phi_2, \dots, \phi_r$ are graduated separately by the
same summation formula it will be seen that the same equation will
connect the graduated rates.

Sickness rates analysed according to period of attack ($z^{13}$, $z^{13/13}$, etc.
in the usual notation) are from their very nature additive before
graduation. For instance

$$z_x^{26} = z_x^{13} + z_x^{13/13}.$$

If they are graduated by summation the resulting rates will still be
additive because each graduated rate is a linear function of ungraduated
rates (see para. 10).

Thus $z_x^{26}$ (graduated) $= z_x^{13}$ (graduated) $+\ z_x^{13/13}$ (graduated).
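The additive property follows from linearity alone and is easy to confirm numerically. In the sketch below the weights `K` and the two series of sickness rates are invented purely for illustration; the weights are deliberately not symmetrical, to show that the property does not depend on the formula being a summation formula.

```python
from fractions import Fraction

def graduate(values, weights):
    """Apply a linear formula with the given weights wherever it fits."""
    r = len(weights) // 2
    return [sum(w * values[i + k - r] for k, w in enumerate(weights))
            for i in range(r, len(values) - r)]

# Hypothetical weights K_{-2}, ..., K_{2}; not symmetrical.
K = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 5),
     Fraction(1, 4), Fraction(-1, 20)]

# Invented ungraduated rates for the first and second 13 weeks of sickness.
z13 = [Fraction(n, 100) for n in (52, 55, 61, 60, 68, 75, 74, 83, 91)]
z13_13 = [Fraction(n, 100) for n in (11, 14, 13, 17, 19, 18, 24, 27, 26)]

# Rates for the first 26 weeks are additive before graduation.
z26 = [a + b for a, b in zip(z13, z13_13)]

sum_of_graduated = [a + b for a, b in
                    zip(graduate(z13, K), graduate(z13_13, K))]
graduated_sum = graduate(z26, K)

print(sum_of_graduated == graduated_sum)
```

Exact fractions are used so that the equality is not obscured by rounding; with rates rounded after graduation the relation would hold only to within a unit in the last place.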

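To illustrate this we shall consider the formula

$$u'_x = K_{-r}u_{x-r} + K_{-r+1}u_{x-r+1} + \dots + K_r u_{x+r}.$$

In a summation formula $K_{-t} = K_t$, but the argument still applies
to the more general formula for which this relationship does not
hold.

If we apply the same formula to graduate $z^{13}$ and $z^{13/13}$ we deduce
that

$$z_x^{13} + z_x^{13/13}\ \text{(graduated)} = K_{-r}\left(z_{x-r}^{13} + z_{x-r}^{13/13}\right) + K_{-r+1}\left(z_{x-r+1}^{13} + z_{x-r+1}^{13/13}\right) + \dots + K_r\left(z_{x+r}^{13} + z_{x+r}^{13/13}\right),$$

where all the functions on the right-hand side are ungraduated.
But

$$z_{x-r}^{13} + z_{x-r}^{13/13} = z_{x-r}^{26},$$

$$z_{x-r+1}^{13} + z_{x-r+1}^{13/13} = z_{x-r+1}^{26}, \text{ etc.}$$

Hence the right-hand side reduces to

$$K_{-r}z_{x-r}^{26} + K_{-r+1}z_{x-r+1}^{26} + \dots + K_r z_{x+r}^{26} = z_x^{26}\ \text{(graduated)}.$$

28. Disadvantages of the method.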

(1) Although skill is needed in choosing a suitable formula, once
the choice has been made there is no scope for individual judgment.
As a result it is impossible to retain any special feature of the experience
(such as a discontinuity in withdrawal rates) which the
operator might feel to be an essential feature. It is certain to be
greatly modified in the graduation and might even disappear
altogether.

(2) Unless the unadjusted rates progress fairly smoothly the
results will be unsatisfactory. This means in effect that the experience
must be large.

(3) The ends of the table always have to be completed by some
other method.

(4) The method cannot be used satisfactorily for select rates, and
since it assumes that the function operated on has negligible fourth
differences over the range of the formula, its use is in practice
restricted to ratios such as $q_x$, etc. It is therefore impossible to
take into account the weight of the exposed to risk at each age.


29. Illustrative example. 

We shall conclude this chapter with an example which illustrates 
many of the points discussed. 

Example 6. 

The following table is a representative extract of ten values from a 
complete table in which x ranges from 20 to 100. Column (2) gives the 
values of a certain function of x calculated by a mathematical formula. 
Column (3) gives the results of an experimental approximation thereto 
and differs from column (2) only in small superimposed errors. Columns 
(4) and (5) are the results of graduating column (3) by Woolhouse’s 
formula and Spencer’s 21-term formula.

(a) Test roughly the agreement between the theoretical smoothing 
index of each formula and the smoothing power as disclosed by the figures 
in the table. Give possible reasons for any anomaly. 

(b) Suggest very briefly any reason for the fact (not confined to the
ten values shown) that the graduation by Woolhouse’s formula, which is
theoretically less powerful than Spencer’s, produces results nearer to
the true values.

  x     True value    Experimental    Graduation of column (3) by
        of function      value         Woolhouse       Spencer
 (1)       (2)            (3)             (4)            (5)

 50        691            663             681            675
 51        696            719             688            681
 52        706            702             698            692
 53        726            728             723            714
 54        762            749             760            751
 55        821            844             819            811
 56        911            919             908            902
 57       1041           1028            1040           1031
 58       1221           1220            1218           1211
 59       1462           1473            1457           1451


To deal with (a) we must first find the ungraduated errors $\epsilon'$, i.e.
the differences between the true values given in column (2) and the
observed values, and also the graduated errors, taken as the difference
between the graduated and the true values. The results, including the
third differences, are shown on p. 216.

$\epsilon_1$ and $\epsilon_2$ represent graduated errors produced by the use of Woolhouse’s
and Spencer’s formulae respectively.

The smoothing index deals not with the third differences of the actual
errors $\Delta^3\epsilon$ but with the standard deviation of the values of $\Delta^3\epsilon$ deduced
from a great many applications of the formulae. As a rough test we
may however compare the actual values of $\Delta^3\epsilon$ and $\Delta^3\epsilon'$ taken positively.

$\Sigma|\Delta^3\epsilon'|$ (disregarding signs) = 456.

$\Sigma|\Delta^3\epsilon_1|$ for Woolhouse graduation = 34.

$\Sigma|\Delta^3\epsilon_2|$ for Spencer graduation = 11.

A comparison item by item would be useless.

Thus $\Sigma|\Delta^3\epsilon|$ has been reduced in the ratio 34/456, or 1/13 approx., by Woolhouse’s
formula and in the ratio 11/456, or 1/41 approx., by Spencer’s formula.

Bearing in mind the limitations imposed on the test by the fact that
only seven values are available, the result for the first
graduation may be said to be consistent with the smoothing index
calculated on p. 198.

On p. 201, however, we showed that the smoothing index for Spencer’s
formula is considerably smaller, so that even allowing for the roughness of the test
the result of about 1/41 requires some explanation. An examination of
the difference table shows that the numbers involved are very small and
indicate a lack of significant figures rather than deficiency in smoothing
power. Had another significant figure been retained the result would
almost certainly have been improved. Even if a series of values is ideally
smooth the third differences will exhibit irregularities if the values
are curtailed or rounded off, so that they no longer give the exact
figures.
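The three totals quoted in part (a) can be reproduced from the table of Example 6. The sketch below is an illustrative check only; the helper names are invented, and the experimental value 1473 at $x = 59$ is the one consistent with the totals quoted in the text.

```python
def diff(seq, order=1):
    """Forward differences of the given order."""
    for _ in range(order):
        seq = [b - a for a, b in zip(seq, seq[1:])]
    return seq

# Columns (2) to (5) of the table in Example 6, x = 50 to 59.
true = [691, 696, 706, 726, 762, 821, 911, 1041, 1221, 1462]
exper = [663, 719, 702, 728, 749, 844, 919, 1028, 1220, 1473]
wool = [681, 688, 698, 723, 760, 819, 908, 1040, 1218, 1457]
spen = [675, 681, 692, 714, 751, 811, 902, 1031, 1211, 1451]

def abs_third_diff_total(values):
    """Sum of |third differences| of the errors (value minus true value)."""
    errors = [v - t for v, t in zip(values, true)]
    return sum(abs(d) for d in diff(errors, 3))

ungraduated = abs_third_diff_total(exper)
woolhouse = abs_third_diff_total(wool)
spencer = abs_third_diff_total(spen)

print(ungraduated, woolhouse, spencer)
print(woolhouse / ungraduated, spencer / ungraduated)
```

Ten values give only seven third differences, which is why the comparison can be no more than a rough test.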

(b) It is debatable whether the word “powerful” applied to a summation
formula refers to error-reducing power or smoothing power,
properties which cannot both exist to the greatest degree in any one
formula. In (a) we considered smoothing power, but error-reducing
power remains to be examined. We have shown on pp. 195 and 203 that
the error-reducing index of Woolhouse’s formula is .423, while that of
Spencer’s 21-term formula is .378.

As a rough test we may consider the total of the errors regardless of 
sign, although the error-reducing index relates to standard deviations 
and not to the results of one experiment. 

From the previous table we have 

Total of ungraduated errors, irrespective of sign = 126. 

Total of graduated errors, irrespective of sign (Woolhouse gradua¬ 
tion) =45. 

Total of graduated errors, irrespective of sign (Spencer graduation)
= 118.

As we know that Spencer’s formula should reduce the unadjusted
errors $\epsilon'$ to about one-third of their former value we are led to suspect
that the total of 118 is due in part to distortion of the true values.

The table on p. 217 shows the differences of the true values.

From our previous work we know that neither formula introduces a
second difference error. Woolhouse’s has, however, a fourth difference
error of $-5.4\delta^4$ while Spencer’s has a fourth difference error of $-12.6\delta^4$.

Hence the graduated errors we have been considering are not the true
errors which we previously denoted by $\epsilon$, but are of the form $-5.4 + \epsilon$
(Woolhouse) and $-12.6 + \epsilon$ (Spencer). Fortunately the fourth difference
errors are constant so that in the first part of the solution the columns
headed $\Delta\epsilon$, $\Delta^2\epsilon$ and $\Delta^3\epsilon$ in the table are correct, although the values from
which $\Delta\epsilon$ was obtained are misleading. The smoothness of the results
is unaffected.
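Both the error-reducing indices and the fourth difference errors can be recovered from the expanded weights of the two formulae. The sketch below assumes the usual forms, Woolhouse’s as $\frac{[5]^3}{125}\{10[1] - 3[3]\}$ and Spencer’s as $\frac{[5]^2[7]}{350}\{[1] + [3] + [5] - [7]\}$; the helper names are invented for this illustration.

```python
from fractions import Fraction
from math import sqrt

def expand(operand, sums, divisor):
    """Apply successive summations [n] to an operand and divide through."""
    w = list(operand)
    for n in sums:
        # Running sums of n consecutive coefficients.
        w = [sum(w[j] for j in range(max(0, i - n + 1), min(i, len(w) - 1) + 1))
             for i in range(len(w) + n - 1)]
    return [Fraction(c, divisor) for c in w]

woolhouse = expand([-3, 7, -3], (5, 5, 5), 125)
spencer = expand([-1, 0, 1, 2, 1, 0, -1], (5, 5, 7), 350)

def moment(weights, power):
    """Sum of weight times offset**power, offsets measured from the centre."""
    r = len(weights) // 2
    return sum(w * (k - r) ** power for k, w in enumerate(weights))

def fourth_difference_error(weights):
    """Coefficient of the fourth difference in the graduated value.

    Valid for a symmetrical formula with no second difference error:
    applied to x^4/24, whose fourth difference is 1, the formula
    returns exactly this coefficient at x = 0.
    """
    return moment(weights, 4) / 24

def error_reducing_index(weights):
    """Square root of the sum of the squares of the weights."""
    return sqrt(sum(w * w for w in weights))

for name, w in (("Woolhouse", woolhouse), ("Spencer", spencer)):
    print(name, len(w), moment(w, 2),
          fourth_difference_error(w), round(error_reducing_index(w), 3))
```

Since the fourth differences of the true values in Example 6 are constant and equal to 1, these coefficients are also the actual distortions at every age in the table.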

To find the graduated errors $\epsilon$ we must eliminate the fourth difference
errors. The following values will be produced.

   x      Woolhouse     Spencer

  50         -5            -4
  51         -3            -3
  52         -3            -2
  53          2             0
  54          3             1
  55          3             2
  56          2             3
  57          4             2
  58          2             2
  59          0             1

 Total       27            20

It would be illogical to introduce decimals, as the values were all 
recorded as integers and the fourth difference errors were taken as - 5 
and -12. 

If allowance is made for the small number of values tested the totals, 
irrespective of sign, viz. 27 and 20, are not inconsistent with the smoothing 
powers of the two formulae. 
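The totals of 45 and 118 before correction, and of 27 and 20 after the fourth difference errors are removed, follow directly from the table of Example 6. The sketch below is an illustrative check with invented variable names; the corrections are taken as -5 and -12, as in the text.

```python
# True values and the two graduations from Example 6, x = 50 to 59.
true = [691, 696, 706, 726, 762, 821, 911, 1041, 1221, 1462]
wool = [681, 688, 698, 723, 760, 819, 908, 1040, 1218, 1457]
spen = [675, 681, 692, 714, 751, 811, 902, 1031, 1211, 1451]

def total_abs(errors):
    """Total of the errors irrespective of sign."""
    return sum(abs(e) for e in errors)

# Graduated errors as first computed: graduated minus true value.
wool_raw = [g - t for g, t in zip(wool, true)]
spen_raw = [g - t for g, t in zip(spen, true)]

# Remove the constant fourth difference errors of -5 and -12.
wool_corrected = [e + 5 for e in wool_raw]
spen_corrected = [e + 12 for e in spen_raw]

print(total_abs(wool_raw), total_abs(spen_raw))
print(wool_corrected, total_abs(wool_corrected))
print(spen_corrected, total_abs(spen_corrected))
```

Before the correction Spencer’s graduation appears much the worse of the two; afterwards it is slightly the better, which is the order suggested by the error-reducing indices.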

The relative power is more consistent with the error-reducing indices
than the absolute power of the formulae.


EXAMPLES 8


1. Analyse and criticize fully the following summation formula,
evaluating any criteria which would enable you to compare (a) its
smoothing power, (b) its wave-cutting properties, with those of another
formula

1/^ _ -1 — 2 Uq + 15 («! + — 8 (M2 + M_2)}‘ 

Illustrate its use by graduating the central terms of the following series: 


112, 103, 124, 118, 106, 127, 115, 115, 
127, 124, 112, 130, 127, 118, 133, 130. 


2. Analyse and criticize the following summation formula: 

{«0 + («1 + «-l) - («3 + »-s)}- 

3. Calculate the values of /, m and n in the summation formula 

35 

The formula contains 15 terms and does not involve any second difference 
error. 

Show how to apply the formula by graduating the central terms of
the series

0, 4, 4, 4, 5, 4, 2, 4, 3, 5, 6, 4, 7, 5, 3, 5, 7, 6, 9, 8, 5.

4. State briefly the theoretical basis of formulae for summation 
graduation and mention the circumstances in which the formulae give 
satisfactory results. 

Find the missing part of the following 17-term formula, calculate the
smoothing coefficient and discuss the merits and demerits of the formula
as an instrument for graduation:



5. A summation formula of graduation has been wrongly written in 
the last two terms of the operand as 

Correct the mistakes, analyse and criticize the corrected formula and 
compare it with any standard formula of the same type of which you are 
aware. 


6. Explain the following statement, which refers to Woolhouse’s 
summation formula 

rcia 

“The errors in the ungraduated values are reduced by the graduation 
to about the values they would have in an ungraduated experience of 
five times the magnitude. The smoothness of the graduated curve would, 
however, be much greater than that of an ungraduated curve based on 
the larger experience. “ 


7. In graduating mortality rates by summation formulae, what 
advantages are gained by graduating separately the exposed to risk and 
deaths? 

Mention the principal objections to graduation by means of sum¬ 
mation formulae. It has been suggested that graduated values of 
can be obtained from the graduated values of and by use of the 
fonnula ^ +(i - a,) ?*+,. 

What is the rationale of the suggestion and how would you calculate 
the values of af if it were desired to secure equality between the expected 
deaths obtained by use of the graduated select g^s and the actual deaths, 
in each year of Assurance within the select period ? 


8. (a) A mortality experience has been graduated by the following 
process: 

(i) The exposed to risk were replaced by a smooth curve of similar 
shape calculated by a mathematical formula. 

(ii) Adjusted deaths were obtained by multiplying the substituted 
exposed to risk by the ungraduated rates of mortality. 





(iii) The adjusted deaths were graduated by a summation formula. 

(iv) Graduated rates of mortality were obtained by dividing the 
graduated deaths by the substituted exposed to risk. 

Discuss the advantages and disadvantages of the method. 

(b) In the graduation of the above experience the unadjusted and the
adjusted deaths over the range of ages from 45 to 70 were approximately 
constant at 200 deaths in each year of age. The summation formula used 

!=^{[3] + [5]-[7]}«x. 

What are the approximate probable errors, expressed as percentages 
in the ungraduated and graduated rates of mortality at age 57? 

9. The table below shows index-numbers of the price of a staple 
commodity over the years 1872 to 1911 (1891 = 1000). For the purpose 
of comparison with the trend, over the period, of the price of another 
commodity, the price of which exhibits similar features, it is suggested 
that the series be graduated by a summation formula. Do you approve?

Devise a formula which you would consider suitable in the circum¬ 
stances and calculate its smoothing index. 


Year   Index no.    Year   Index no.    Year   Index no.    Year   Index no.

1872     1356       1882     1235       1892      988       1902      907
1873     1452       1883     1255       1893      973       1903      882
1874     1493       1884     1266       1894      998       1904      880
1875     1519       1885     1288       1895     1024       1905      930
1876     1469       1886     1310       1896     1047       1906      950
1877     1404       1887     1276       1897     1069       1907      970
1878     1371       1888     1173       1898     1011       1908      990
1879     1352       1889     1121       1899      994       1909      961
1880     1311       1890     1070       1900      935       1910      915
1881     1250       1891     1000       1901      946       1911      895


10. What is the effect of applying a summation formula to a series
which is already smooth? Illustrate your answer by considering the
application of the formula

$$\frac{[4][5][6]}{120}\{2[3] - [5]\}$$

to a series whose nth term is of the form

$$a + bn + cn^2 + dn^3 + en^4.$$




CHAPTER IX 


GRADUATION BY MATHEMATICAL 
FORMULAE. MAKEHAM AND 
ALLIED CURVES 

1. Preliminary considerations.

Hitherto we have started from the data and derived a more or less 
smooth series of values from them. In this chapter we commence 
with a smooth curve and adjust the constants in the equation to 
the curve so as to secure the best adherence to data. 

Before attempting this line of approach in practice it is necessary 
to bear in mind the source from which the crude data were derived, 
any heterogeneous features and other peculiarities. 

Heterogeneity is usually the greatest stumbling-block, because 
any rate of decrement derived from data open to this criticism is 
unlikely to follow a single mathematical curve over its whole range. 
Further, any graduated table, however derived from the data, is 
liable to be suspect unless used in conjunction with a population 
similar in constitution to that on which the table is based. 

As with the graphic method we can operate either on the exposed 
to risk and decrements separately or on the crude rates of decrement. 
The advantages and disadvantages of each method were discussed 
in Chapter VI. In dealing with mathematical formulae the graduation
of exposed to risk and decrements separately has the added
disadvantage that the rates finally obtained will be represented by
complicated expressions difficult to handle in theoretical work, e.g. 
in dealing with joint-life functions. 


2. Makeham and Gompertz curves.

The first important contribution towards finding a “law of
mortality” was made by Benjamin Gompertz, who found that $\mu_x$
could be represented approximately by the formula $Bc^x$, i.e. by the
successive terms of a geometric progression. He then proceeded to
find the best values of the available constants $B$ and $c$.

A development of Gompertz’s law was subsequently made by
Makeham, who adopted the now well-known formula

$$\mu_x = A + Bc^x.$$

With its three constants $A$, $B$ and $c$ this formula was found for
most tables to give a satisfactory agreement with the facts, and more
standard tables have been produced by its use than by any other
method. A mass of literature, mathematical and otherwise, has
grown up around the formula, and the student will be familiar with
the ways in which joint-life functions and problems involving complicated
multiple-life statuses can be dealt with if the table used has
been graduated by Makeham’s formula. So convenient is it that
its adoption is often justified at the expense of a certain amount of
distortion of the facts.

Of recent years it has been increasingly difficult to obtain satisfactory
graduation by this simple formula. Allied formulae, such as

$$\mu_x = A + Hx + Bc^x,$$

have been tried without any very great success.

3. Preliminary tests. 

Usually the rough data are given in quinquennial or decennial 
groups. Even when this is not so it is desirable to amalgamate the 
entries into one of these groups. 

By doing this we reduce irregularities very considerably, particularly
at the ends of the table, and it is much easier to form an
opinion as to whether a Makeham graduation is likely to be successful
by examining group rates than by examining rates for individual
ages.

It will be assumed therefore that the exposed to risk and decrements
are available in groups of five years.

If the exposed to risk is given in the initial form it must be
adjusted to the central form $E_x^c$ by the deduction of half the
corresponding decrements.

Note. The $c$ in $E_x^c$ is not in any way connected with the $c$ in
Makeham’s formula.

In Chapter I it was shown that Hardy’s formula

$$5u_x = w_0 - \tfrac{1}{24}\Delta^2 w_{-1}$$

could be applied to the central exposed to risk to give $P_x$ for the
central point of age of a group and to the deaths to give $P_x\mu_x$ for
the same age. It cannot, however, be used for the exposed to risk
in the “initial” form, since this is essentially a discontinuous
function. This accounts for the change to the “central” exposed
$E_x^c$.

The application of Hardy’s formula then gives a succession of
values of $P_x$ and $P_x\mu_x$ and hence, by division, a set of values of $\mu_x$.

From these values it is possible to form an opinion whether a
Gompertz or a Makeham graduation is likely to prove successful.

If the ratio of successive terms is roughly constant a Gompertz
curve is indicated, with $\mu_x$ of the form $Bc^x$. To test if a Makeham
curve is appropriate we form the first differences of successive terms
and then find the ratios of the differences. The first step eliminates
the constant $A$ and should produce values roughly in geometric
progression.

Sometimes it is more convenient to deal with colog $p_x$ instead
of $\mu_x$, and since these functions are of the same form the same
process can be applied to both.

In the preliminary tests for the graduation of the a(m) and a(f)
Tables, based on British Offices’ annuity experience over the years
1900-20, the function colog $p_x$ was used, with the following results:

Males

Age last     colog p_x    Δ colog p_x    colog p_{x+5}     Δ colog p_{x+5}
birthday x                                / colog p_x       / Δ colog p_x
   (1)          (2)           (3)             (4)                (5)

    50        .00436        .00176           1.40               2.01
    55        .00612        .00354           1.58               1.26
    60        .00966        .00446           1.46               1.32
    65        .01412        .00588           1.42               1.96
    70        .02000        .01152           1.58               1.79
    75        .03152        .02056           1.65               1.28
    80        .05208        .02623           1.50               1.34
    85        .07831        .03520           1.45               1.62
    90        .11351        .05719           1.51               1.68
    95        .17070        .09630           1.56
   100        .26700

The report comments on this table as follows: “It will be seen
that column (4) would not be much distorted if an average value of
1.51 were assumed; this value is too high up to 65, then too low for
the important ages 70 and 75 and a little too high afterwards. The
average of column (5) is 1.58 (or 1.51 if we exclude the first and last
entries), so that the evidence of the table leads us to say that the
mortality follows the Gompertz law sufficiently nearly to justify an
attempt at graduation and that $c^5$ is about 1.51, or log c about .036...”
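The preliminary tests amount to taking ratios of successive terms (the Gompertz test) and ratios of successive first differences (the Makeham test). The sketch below applies them to the male values of colog $p_x$ in column (2); the function and variable names are invented for this illustration.

```python
from math import log10

# Column (2) of the male table: colog p_x at ages 50, 55, ..., 100.
colog_p = [0.00436, 0.00612, 0.00966, 0.01412, 0.02000, 0.03152,
           0.05208, 0.07831, 0.11351, 0.17070, 0.26700]

def successive_ratios(values):
    """Ratio of each term to the one before it."""
    return [b / a for a, b in zip(values, values[1:])]

def mean(values):
    return sum(values) / len(values)

# Column (3): first differences, which eliminate the constant A.
first_diffs = [b - a for a, b in zip(colog_p, colog_p[1:])]

gompertz_test = successive_ratios(colog_p)      # column (4)
makeham_test = successive_ratios(first_diffs)   # column (5)

print([round(r, 2) for r in gompertz_test])
print([round(r, 2) for r in makeham_test])
print(round(mean(gompertz_test), 2), round(mean(makeham_test), 2))

# An average five-year ratio of about 1.51 implies log c of about .036.
print(round(log10(mean(gompertz_test)) / 5, 3))
```

The ratios are taken over five-year intervals, so each estimates $c^5$ rather than $c$; this is why the logarithm is divided by 5.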

The table for females was as follows:

Females

Age last     colog p_x    Δ colog p_x    colog p_{x+5}     Δ colog p_{x+5}
birthday x                                / colog p_x       / Δ colog p_x
   (1)          (2)           (3)             (4)                (5)

    50        .00349        .00087           1.25               1.52
    55        .00436        .00132           1.30               2.00
    60        .00568        .00265           1.47               1.68
    65        .00833        .00445           1.53               1.83
    70        .01278        .00813           1.64               1.82
    75        .02091        .01483           1.71               1.87
    80        .03574        .02775           1.78               1.46
    85        .06349        .04054           1.64               1.25
    90        .10403        .05087           1.49
    95        .15490

The report continues: “This table shows that the later values in
column (4) are considerably greater than the earlier values. It
follows, therefore, that there is little hope of a graduation on Gompertz’s
hypothesis. The next column, which supplies a test for
Makeham’s formula, is better and the values opposite ages 55 to 75
could be averaged at 1.84 without leading to distortion.”

As will be seen later, it was found impossible to produce a satisfactory
graduation for either sex, using a single curve for the whole
range. The above tables are, however, interesting, since they show
the sort of results that preliminary tests are likely to give in practice.

4. The Makeham constant c.

Tests on the above lines give an indication of the constant $c$, but
the following method is better for finding a more exact value.

Suppose that by the use of Hardy’s formula we have obtained
crude values of $\mu_x$ for ages 27½, 32½, 37½, …, 62½, 67½, the values at
younger and higher ages being unreliable.

It is desirable to base the value of $c$ on all these observations. To
do so we form three composite functions:

$$S_1 = \mu_{27\frac12} + 3\mu_{32\frac12} + 5\mu_{37\frac12} + 6\mu_{42\frac12} + 5\mu_{47\frac12} + 3\mu_{52\frac12} + \mu_{57\frac12},$$

$$S_2 = \mu_{32\frac12} + 3\mu_{37\frac12} + 5\mu_{42\frac12} + 6\mu_{47\frac12} + 5\mu_{52\frac12} + 3\mu_{57\frac12} + \mu_{62\frac12},$$

$$S_3 = \mu_{37\frac12} + 3\mu_{42\frac12} + 5\mu_{47\frac12} + 6\mu_{52\frac12} + 5\mu_{57\frac12} + 3\mu_{62\frac12} + \mu_{67\frac12}.$$

The coefficients 1, 3, 5, 6, 5, 3, 1 are quite arbitrary and might
equally well have been taken as 1, 2, 3, 4, 3, 2, 1. The reason for their
introduction will be made clear at a later stage.

If $\mu_x = A + Bc^x$,

$$S_3 - S_2 = Bc^{32\frac12}(c^5 - 1)(1 + 3c^5 + 5c^{10} + 6c^{15} + 5c^{20} + 3c^{25} + c^{30})$$

and

$$S_2 - S_1 = Bc^{27\frac12}(c^5 - 1)(1 + 3c^5 + 5c^{10} + 6c^{15} + 5c^{20} + 3c^{25} + c^{30}).$$

Hence

$$c^5 = \frac{S_3 - S_2}{S_2 - S_1}. \qquad (1)$$

One reason for the introduction of the coefficients rising to a
maximum in the centre and diminishing towards each end is that
less weight is thus given to the values $\mu_{27\frac12}$ and $\mu_{67\frac12}$ which will
generally be based on fewer data. Most weight is therefore attached
to the values in the middle of the available range.

Another important point is that, without some such coefficients
being introduced, $S_3 - S_2$ would reduce to $\mu_{67\frac12} - \mu_{32\frac12}$ and $S_2 - S_1$
to $\mu_{62\frac12} - \mu_{27\frac12}$, the remaining values disappearing. The estimate of $c$
would not be based on all the available values of $\mu$ but on four
values only.

Having obtained an estimate of $c$, $\log_{10} c$ is found and is usually
rounded off to two or three significant figures.
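The calculation of $c^5$ from the three composite functions can be sketched as follows. The function name `estimate_c5` and the test data are invented for this illustration; the data are generated from an exact Makeham curve, so the method should recover the value of $c$ used to build them.

```python
def estimate_c5(mu, weights=(1, 3, 5, 6, 5, 3, 1)):
    """Estimate c^5 as (S3 - S2) / (S2 - S1).

    `mu` holds crude values of the force of mortality at nine ages
    spaced five years apart; each S applies the weights to seven
    consecutive values, shifted by one group each time.
    """
    if len(mu) != len(weights) + 2:
        raise ValueError("need two more values than there are weights")
    s1, s2, s3 = (sum(w * mu[i + shift] for i, w in enumerate(weights))
                  for shift in range(3))
    return (s3 - s2) / (s2 - s1)

# Hypothetical Makeham constants used only to build test data.
A, B, c = 0.004, 0.00003, 1.09
ages = [27.5 + 5 * i for i in range(9)]
mu = [A + B * c ** x for x in ages]

c5 = estimate_c5(mu)
print(c5, c ** 5)
print(c5 ** 0.2)

# With equal weights the middle values cancel and only four remain.
flat = estimate_c5(mu, weights=(1, 1, 1, 1, 1, 1, 1))
print(flat, (mu[8] - mu[1]) / (mu[7] - mu[0]))
```

With real data the estimate depends on the weights chosen, and the last two lines show why equal weights would waste most of the observations.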

5. The Makeham constants A and B.

Wherever possible it is desirable to allow for the size of the
exposed to risk at each age or group of ages, and although this is
impossible in finding a trial value of $c$ it can be and should be done
in finding the remaining constants $A$ and $B$.

By the application of Hardy’s formula we have a set of values of
$P_x$ and $P_x\mu_x$ from which the values of $\mu_x$ used in the preceding section
were derived.

If Makeham’s law applies these are connected by the equation

$$P_x\mu_x = P_x(A + Bc^x). \qquad (2)$$

Once $c$ has been found as above the only unknown quantities
are $A$ and $B$.

Thus, using any two ages, we could form a pair of simultaneous
equations such as (2) and solve for $A$ and $B$. This, however, would
not make use of all the data so we use instead the equations

$$\Sigma P_x\mu_x = A\,\Sigma P_x + B\,\Sigma c^xP_x \qquad (3)$$

and

$$\Sigma\Sigma P_x\mu_x = A\,\Sigma\Sigma P_x + B\,\Sigma\Sigma c^xP_x. \qquad (4)$$

The first of these equations is formed by summing all the available
values and the second by taking the second summations.

The two simultaneous equations can be solved for $A$ and $B$.
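Equations (3) and (4) form a pair of linear equations in $A$ and $B$ once $c$ is fixed. The sketch below solves them by Cramer’s rule; the function names are invented for this illustration, the second summation is taken as the sum of the running totals, and the constants used to build the test data are hypothetical.

```python
def cumulative(values):
    """Running totals (first summations taken progressively)."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out

def makeham_a_b(P, deaths, ages, c):
    """Solve equations (3) and (4) for A and B, given a trial value of c.

    P      : values of P_x at the central ages
    deaths : values of P_x * mu_x at the same ages
    """
    cP = [p * c ** x for p, x in zip(P, ages)]
    # First summations.
    s_d, s_p, s_c = sum(deaths), sum(P), sum(cP)
    # Second summations: the sum of the running totals.
    ss_d, ss_p, ss_c = (sum(cumulative(v)) for v in (deaths, P, cP))
    det = s_p * ss_c - s_c * ss_p
    A = (s_d * ss_c - s_c * ss_d) / det
    B = (s_p * ss_d - s_d * ss_p) / det
    return A, B

# Hypothetical data built from an exact Makeham curve.
ages = [47.5 + 5 * i for i in range(9)]
P = [3886, 4313, 4365, 3794, 3107, 2214, 1328, 595, 187]
true_A, true_B, c = 0.005, 0.00002, 10 ** 0.04
deaths = [p * (true_A + true_B * c ** x) for p, x in zip(P, ages)]

A, B = makeham_a_b(P, deaths, ages, c)
print(A, B)
```

Because the constants are fitted to the first and second summations, those two totals of actual and expected deaths agree by construction; any test of adherence to data has to look elsewhere, for example at the third summations.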

The graduation can now be completed, and although there is no 
need to test for smoothness the usual tests for adherence to data 
must be applied. These may indicate unsatisfactory features, such 
as a large discrepancy between the third summations of the actual 
and expected deaths. In view of the way in which A and B were 
found the first and second summations should show no discrepancies. 

As a result of the tests a new trial value of c may be chosen and 
the constants A and B re-calculated, thus giving a new graduation. 
Finally, the best value of c may be found by interpolation from the 
results of the two trials. For instance, if the discrepancies in the
third summations of the actual and expected deaths are of opposite
signs, a value of c might be adopted so as to make the total discrepancy
approximately zero.

At the same time it should be borne in mind that the advantages 
of a Makeham graduation are so great that the statistical tests of 
adherence to data should not be applied too rigorously, and much 
greater discrepancies than would normally be allowed may well be 
counterbalanced by the practical convenience of the formula. 

6. The Gompertz law.

As the Gompertz law is the Makeham law in the special case
where $A = 0$ any method used for the latter can be applied to the
former.

The following simple method is, however, preferable.

Since $P_x\mu_x = Bc^xP_x$, we have

$$\log P_x\mu_x = x\log c + \log B + \log P_x. \qquad (5)$$

Summing for all available values we have

$$\Sigma\log P_x\mu_x = \log c\,\Sigma x + \Sigma\log B + \Sigma\log P_x \qquad (6)$$

$$\Sigma\Sigma\log P_x\mu_x = \log c\,\Sigma\Sigma x + \Sigma\Sigma\log B + \Sigma\Sigma\log P_x. \qquad (7)$$

Thus $\log c$ and $B$ can be found in one process.

$\Sigma\log B$ is simply $n\log B$, where $n$ is the number of observations
summed; and

$$\Sigma\Sigma\log B = \log B\,(1 + 2 + 3 + \dots + n) = \frac{n(n+1)}{2}\log B.$$

Another method is described in the report on the graduation of
the a(m) and a(f) Tables. This method does not make any allowance
for the weight of the exposed to risk at each age and may
for this reason occasionally give poor results.
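Equations (6) and (7) are two linear equations in $\log c$ and $\log B$. The sketch below solves them by Cramer’s rule; the function names and the constants used to build the test data are invented for this illustration, and the second summation is again taken as the sum of the running totals.

```python
from math import log10

def cumulative(values):
    """Running totals of a list of values."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out

def gompertz_fit(P, deaths, ages):
    """Return (log10 c, log10 B) from the first and second summations."""
    # From (5): log(P mu) - log P = x log c + log B.
    y = [log10(d) - log10(p) for d, p in zip(deaths, P)]
    n = len(y)
    s_y, s_x = sum(y), sum(ages)
    ss_y, ss_x = sum(cumulative(y)), sum(cumulative(ages))
    m = n * (n + 1) / 2          # second summation of a constant
    det = s_x * m - n * ss_x
    log_c = (s_y * m - n * ss_y) / det
    log_b = (s_x * ss_y - ss_x * s_y) / det
    return log_c, log_b

# Hypothetical data built from an exact Gompertz curve.
ages = [47.5 + 5 * i for i in range(9)]
P = [3886, 4313, 4365, 3794, 3107, 2214, 1328, 595, 187]
true_log_c, true_B = 0.04, 0.00003
deaths = [p * true_B * 10 ** (true_log_c * x) for p, x in zip(P, ages)]

log_c, log_b = gompertz_fit(P, deaths, ages)
print(log_c, 10 ** log_b)
```

Taking logarithms makes the problem linear in both unknowns, which is why $c$ need not be found by a separate trial as it is for Makeham’s formula.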

7. Example 1.

Assuming that preliminary tests have indicated that a Makeham 
graduation is likely to be successful, graduate the following data in 
that way: 

Table XX

Age-groups     Exposed to risk     Deaths

  40-44            15,518             65
  45-49            19,428            144
  50-54            21,594            219
  55-59            21,890            378
  60-64            19,174            465
  65-69            15,775            557
  70-74            11,414            685
  75-79             6,993            644
  80-84             3,276            471
  85-89             1,096            217
  90-94               201             67

In the absence of information to the contrary it must be assumed that
the exposed to risk are in the “initial” form and must first be expressed
in the “central” form by deducting half the deaths.

Hardy’s formula can then be applied to find $P_x$ and $P_x\mu_x$ at the
central point of age of each group.

The work can be arranged as follows:

Table XXI

Age-group   Exposed to     Δ(2)     Δ²(2)    $5E^c_x$   Central point
            risk $E^c$                                   of age x
   (1)         (2)          (3)      (4)       (5)          (6)

40-44       15,485.5       3,870
45-49       19,356         2,128    -1,742    19,430        47½
50-54       21,484.5         217    -1,911    21,564        52½
55-59       21,701        -2,759    -2,976    21,825        57½
60-64       18,941.5      -3,445      -686    18,970        62½
65-69       15,496.5      -4,425      -980    15,537        67½
70-74       11,071.5      -4,401        24    11,070        72½
75-79        6,671        -3,631       770     6,639        77½
80-84        3,040.5      -2,053     1,578     2,975        82½
85-89          987.5        -820     1,233       936        87½
90-94          167.5

If the differences are set out as above $\Delta^2 w_{-1}$ appears on the same line
as $w_0$. Column (5) is obtained by subtracting 1/24th of the entries in
column (4) from the corresponding entries in column (2). Although
decimals were retained in column (2) to show more clearly how the figures
were derived, all other entries have been rounded off to the nearest
integer.
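
The derivation of column (5) can be checked with a short calculation. This is a sketch only: the list and function names are illustrative, the entry for the group 60-64 is taken as 19,174 less half of 465, and the second differences are formed from the unrounded figures, so that a result may differ by a unit from the printed entry.

```python
groups = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69",
          "70-74", "75-79", "80-84", "85-89", "90-94"]
# Column (2) of Table XXI: initial exposed to risk less half the deaths.
central_exposed = [15485.5, 19356.0, 21484.5, 21701.0, 18941.5, 15496.5,
                   11071.5, 6671.0, 3040.5, 987.5, 167.5]

def hardy_column(w):
    """w_0 less 1/24th of the second central difference, for interior groups."""
    out = []
    for i in range(1, len(w) - 1):
        second_diff = w[i + 1] - 2 * w[i] + w[i - 1]
        out.append(w[i] - second_diff / 24)
    return out

column_5 = hardy_column(central_exposed)
for group, value in zip(groups[1:-1], column_5):
    print(group, round(value))
```

Dividing each result by the class interval 5 gives the value of $E^c_x$ at the central point of age.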

Table XXII is obtained in a similar manner to Table XXI. In
that Table column (6) is derived by dividing the entries in column (5)
by the corresponding entries in column (5) of Table XXI.



As we now have nine ungraduated values of $\mu_x$ from which to find a
trial value of c, we proceed as follows:

$S_1 = \mu_{47\frac{1}{2}} + 3\mu_{52\frac{1}{2}} + 5\mu_{57\frac{1}{2}} + 6\mu_{62\frac{1}{2}} + 5\mu_{67\frac{1}{2}} + 3\mu_{72\frac{1}{2}} + \mu_{77\frac{1}{2}} = .7352,$

$S_2 = \mu_{52\frac{1}{2}} + 3\mu_{57\frac{1}{2}} + 5\mu_{62\frac{1}{2}} + 6\mu_{67\frac{1}{2}} + 5\mu_{72\frac{1}{2}} + 3\mu_{77\frac{1}{2}} + \mu_{82\frac{1}{2}} = 1.1642,$

$S_3 = \mu_{57\frac{1}{2}} + 3\mu_{62\frac{1}{2}} + 5\mu_{67\frac{1}{2}} + 6\mu_{72\frac{1}{2}} + 5\mu_{77\frac{1}{2}} + 3\mu_{82\frac{1}{2}} + \mu_{87\frac{1}{2}} = 1.8392.$

Hence

$c^5 = \frac{S_3 - S_2}{S_2 - S_1} = \frac{.6750}{.4290} = 1.573,$

giving $\log_{10} c = .03936$.

For convenience we take $\log_{10} c = .04$.

Then, by using five-figure logarithms, $c^xE^c_x$, etc. can readily
be found as follows:

  x     $\log c^xE^c_x$   $10^{-2}c^xE^c_x$    Σ(3)     $E^c_x$   $\sum E^c_x$   $E^c_x\mu_x$    Σ(7)
 (1)         (2)               (3)              (4)       (5)         (6)            (7)         (8)

 47½       5.48951            3,087            3,087     3,886       3,886            29           29
 52½       5.73478            5,430            8,517     4,313       8,199            43           72
 57½       5.93999            8,710           17,227     4,365      12,564            76          148
 62½       6.07910           11,998           29,225     3,794      16,358            93          241
 67½       6.19233           15,572           44,797     3,107      19,465           111          352
 72½       6.24516           17,585           62,382     2,214      21,679           138          490
 77½       6.22319           16,719           79,101     1,328      23,007           130          620
 82½       6.07452           11,872           90,973       595      23,602            95          715
 87½       5.77184            5,914           96,887       187      23,789            43          758

Total                        96,887          432,196    23,789     152,549           758         3425


The figures $E^c_x$ and $E^c_x\mu_x$ are one-fifth of the values in column (5) of
Tables XXI and XXII. Although it is not necessary to divide by 5 in
this way in finding A and B, it has been done to draw the attention of the
reader to the fact that the factor 1/n in Hardy's formula had previously
been ignored.

The equations for A and B are

758 = 23,789A + 96,887 × 10²B
3425 = 152,549A + 432,196 × 10²B

The solutions of the equations are A = .000,910, B = .000,076.

So that $\mu_x = .000,910 + .000,076\,c^x$

where $\log_{10} c = .04$.
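
The trial value of c and the constants A and B may be checked as follows. This is a sketch only: the variable names are illustrative, and $S_1$ is taken as .7352, the value implied by $S_2 - S_1 = .4290$. Because each weighted sum is the preceding one moved on five years, the constant A disappears from the differences and their ratio is $c^5$.

```python
import math

# Weighted sums of the crude rates (s1 inferred from s2 - s1 = .4290).
s1, s2, s3 = 0.7352, 1.1642, 1.8392
c5 = (s3 - s2) / (s2 - s1)
log_c_trial = math.log10(c5) / 5

# Summation equations with log10 c = .04; the unknowns are A and 10^2 * B.
a11, a12, r1 = 23789.0, 96887.0, 758.0
a21, a22, r2 = 152549.0, 432196.0, 3425.0
det = a11 * a22 - a12 * a21
A_exact = (r1 * a22 - a12 * r2) / det
B_exact = (a11 * r2 - a21 * r1) / det / 100

B = round(B_exact, 6)
A = (r1 - a12 * 100 * B) / a11     # first equation, with B rounded to six places
print(round(c5, 3), round(log_c_trial, 5), B, round(A, 6))
```

On this arithmetic the printed A = .000,910 is the value given by the first equation once B has been rounded to .000,076; the unrounded simultaneous solution gives a slightly larger A, of about .00094.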

The following graduated values are then easily calculated:

Central age x     $\mu_x$     Expected deaths $5E^c_x\mu_x$

    47½           .00695              135
    52½           .01048              226
    57½           .01607              351
    62½           .02494              473
    67½           .03900              606
    72½           .06128              678
    77½           .09659              641
    82½           .15255              454
    87½           .24124              226
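
The graduated values can be reproduced from the constants found above. The sketch below assumes the printed constants and the entries of column (5) of Table XXI; the names are illustrative only.

```python
A, B, LOG10_C = 0.000910, 0.000076, 0.04
central_ages = [47.5 + 5 * k for k in range(9)]
# Column (5) of Table XXI, i.e. five times the exposed to risk at the central age.
five_e = [19430, 21564, 21825, 18970, 15537, 11070, 6639, 2975, 936]

def makeham_mu(x):
    """Force of mortality A + B * c**x with log10 c = .04."""
    return A + B * 10 ** (LOG10_C * x)

graduated = [makeham_mu(x) for x in central_ages]
expected = [round(e * mu) for e, mu in zip(five_e, graduated)]
for x, mu, d in zip(central_ages, graduated, expected):
    print(f"{x:5.1f}  {mu:.5f}  {d}")
print("total expected deaths:", sum(expected))
```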


The total of the Expected deaths, 3790, is one more than the total of
the corresponding ungraduated values. The student will find it an
instructive exercise to test this graduation thoroughly by the methods
explained in Chapter V.

8. Curves allied to the Makeham curve.

When the general trend of mortality altered and it was no longer
possible to graduate successfully by Makeham's law, attempts were
made to modify the formula. The most important modification is

$\mu_x = A + Hx + Bc^x.$    (9)

To fit this curve to the data the work is the same as before as far
as the calculation of the rough values of $\mu_x$.

The first differences of these values are of the form $\alpha + \beta c^x$,
where $\alpha$ and $\beta$ are constants, and by working with first differences
instead of $\mu$'s the constant c can be found in the usual way.

A, H and B can then be found from the equations

$\sum E^c_x\mu_x = A\sum E^c_x + H\sum xE^c_x + B\sum c^xE^c_x$
$\sum\sum E^c_x\mu_x = A\sum\sum E^c_x + H\sum\sum xE^c_x + B\sum\sum c^xE^c_x$    (10)
$\sum\sum\sum E^c_x\mu_x = A\sum\sum\sum E^c_x + H\sum\sum\sum xE^c_x + B\sum\sum\sum c^xE^c_x$

Another assumption which has been experimented with is

$\mu_x = ma^x + nb^x.$    (11)

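To find the constants a and b we need four values of $\mu_x$ at equal
intervals. We might for instance group the data decennially instead
of quinquennially; convenient age-groups are 25-35, 35-45, 45-55
and 55-65.

The application of Hardy's formula will then enable crude values
of $\mu_x$ for ages 30, 40, 50 and 60 to be found:

$\mu_{30} = ma^{30} + nb^{30} = L + M$    (12)
$\mu_{40} = ma^{40} + nb^{40} = L\kappa + M\lambda$    (13)
$\mu_{50} = ma^{50} + nb^{50} = L\kappa^2 + M\lambda^2$    (14)
$\mu_{60} = ma^{60} + nb^{60} = L\kappa^3 + M\lambda^3$    (15)

(where $\kappa = a^{10}$, $\lambda = b^{10}$).

From these we obtain

$\mu_{30}\mu_{50} - \mu_{40}^2 = LM(\kappa^2 + \lambda^2 - 2\kappa\lambda) = LM(\kappa - \lambda)^2$

$\mu_{40}\mu_{60} - \mu_{50}^2 = LM(\kappa^3\lambda + \kappa\lambda^3 - 2\kappa^2\lambda^2) = LM(\kappa - \lambda)^2\kappa\lambda$

and

$\mu_{30}\mu_{60} - \mu_{40}\mu_{50} = LM(\kappa^3 + \lambda^3 - \kappa^2\lambda - \kappa\lambda^2) = LM(\kappa - \lambda)^2(\kappa + \lambda)$

giving

$\kappa\lambda = \frac{\mu_{40}\mu_{60} - \mu_{50}^2}{\mu_{30}\mu_{50} - \mu_{40}^2}$    (16)

and

$\kappa + \lambda = \frac{\mu_{30}\mu_{60} - \mu_{40}\mu_{50}}{\mu_{30}\mu_{50} - \mu_{40}^2}.$    (17)

Since the $\mu$'s are known $\kappa$ and $\lambda$ can be found and thence a and b.

The remaining constants m and n are best calculated from the
equations

$\sum E^c_x\mu_x = m\sum a^xE^c_x + n\sum b^xE^c_x$
$\sum\sum E^c_x\mu_x = m\sum\sum a^xE^c_x + n\sum\sum b^xE^c_x.$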
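
The solution of equations (16) and (17) amounts to finding the roots of a quadratic whose sum and product are known. The following sketch illustrates this; the function name and the constants used to manufacture the four values of $\mu$ are hypothetical.

```python
import math

def double_geometric_roots(mu30, mu40, mu50, mu60):
    """Solve (16) and (17) for kappa = a**10 and lambda = b**10."""
    denom = mu30 * mu50 - mu40 ** 2
    product = (mu40 * mu60 - mu50 ** 2) / denom      # kappa * lambda
    total = (mu30 * mu60 - mu40 * mu50) / denom      # kappa + lambda
    disc = math.sqrt(total ** 2 - 4 * product)
    return (total + disc) / 2, (total - disc) / 2

# Hypothetical constants, used only to produce consistent trial values.
m, a, n, b = 0.0001, 1.09, 0.002, 1.01
mu = {x: m * a ** x + n * b ** x for x in (30, 40, 50, 60)}
kappa, lam = double_geometric_roots(mu[30], mu[40], mu[50], mu[60])
a_found, b_found = kappa ** 0.1, lam ** 0.1
print(round(a_found, 4), round(b_found, 4))
```

With crude rates the quantity under the square root may prove negative, in which case no real a and b exist and the assumption must be abandoned or the grouping altered.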

There is an infinity of curves of this general type which can be 
fitted to mortality rates. 

The general procedure is usually the same. Crude values of $\mu_x$
are calculated after suitable groupings of the exposed to risk and
deaths, and constants having the age as index (such as c in Makeham's
formula) are calculated from the values of $\mu_x$, suitable arbitrary
weights being introduced where possible to lessen the importance
of the $\mu$'s based on more scanty data at the ends of the experience.
Considerable ingenuity is often required and no general rules can
be given.

Once these constants having the age as index are known, or at 
least suitable trial values adopted, the remaining constants can be 
found by equating the first, second and even third summations of 
the actual and expected deaths, as in equations (10) above. It is 
usually unwise to introduce summations beyond the third, since, 
apart from the large numbers involved, too much importance is 
thereby attached to the rates at high ages where the rates are least 
reliable. As an alternative, summations over half the range might 
be used instead of a single summation over the whole range. 

9. Application to $m_x$ and $q_x$.

Sometimes it is desired to fit a curve to $m_x$, the central rate of
mortality, instead of to $\mu_x$. The only modification of the above is
in the initial stage. From grouped data we can always find the central
term by the formula

where n is the class interval.

Hence from the grouped values of $E^c_x$ and $\theta_x$ we obtain not $E^c_x$
and $E^c_x\mu_x$ at the central point of age but $E^c_x$ and $\theta_x$ for the central
age, e.g. age 32 for the quinquennial group 30-35. Crude values of
$m_x$ are thus obtained and the rest of the work is as before.

Similarly, if grouped values of $E_x$ and not $E^c_x$ are used, the values
of $E_x$ for the central age obtained by the use of the formula enable
$q_x$ to be calculated. It will be remembered that the formula

can be applied to any function, continuous or otherwise, provided
that fourth and higher differences are negligible.


10. Perks's formulae.

In a paper in J.I.A. Vol. lxiii, W. Perks has given some interesting
formulae which have produced good results with modern data and
which represent the most promising attempt of recent years to fit
a single curve to the whole range of a table. The paper should be
read carefully by anyone interested in modern developments.

The principal formulae discussed are

$\mu_x \ (\text{or } q_x) = \frac{A + Bc^x}{1 + Dc^x}$    (both functions are used)

and

$\mu_x = \frac{A + Bc^x}{Kc^{-x} + 1 + Dc^x}.$

Perks himself assumed an arbitrary value of c, but it is possible to
calculate one from the data on the lines of the previous examples.
As an exercise the reader is advised to evolve a suitable method.

The remaining constants can be found as usual by the method
of successive summations, although the denominator presents
difficulty. To overcome this the equations can be written in the
form

$\mu_x + Dc^x\mu_x = A + Bc^x$

and $Kc^{-x}\mu_x + \mu_x + Dc^x\mu_x = A + Bc^x.$

First, second and, if necessary, third summations can then be
formed, and if c is known the remaining constants can readily be
found.

A word of warning is advisable in this connection. The third 
summation is open to objection because of the weight given to the 
observations at ages remote from the mean; this applies with still 
more force to the fourth summation.

To make up the required number of equations therefore it may
be necessary to take the second and third summations over each half 
of the range. This is to some extent fitting the curve in sections 
rather than as a whole, but the tests for adherence to data will reveal 
any weakness in the results. 

11. Example 2.

The following is a good example of how a variable denominator can
be dealt with.

Graduate the withdrawal rate in the following schedule by the formula

$w_n = \frac{a}{b + n}.$

Duration n    Exposed to risk        Withdrawals    Rate of
              of withdrawal $E_n$    $W_n$          withdrawal $w_n$

    0              1600                 240             .150
    1              1800                 162             .090
    2              1800                 117             .065
    3              1600                  80             .050
    4              1200                  54             .045
    5               800                  28             .035
    6               300                   9             .030
    7               100                   4             .040

The exposed to risk varies from 1800 at durations 1 and 2 to 100 at
duration 7, so that any attempt to find a and b from the rates given in the
last column would be unlikely to give good results unless proper allowance
were made for the weight of the data at each duration.

It is easier to deal with the values of $E_n$ and $W_n$ rather than the rates
and we first write the equation

$\frac{W_n}{E_n} = \frac{a}{b + n}$

in the form $(b + n)W_n = aE_n$.

By summing twice, we obtain the equations for a and b:

$b\sum W_n + \sum nW_n = a\sum E_n$
$b\sum\sum W_n + \sum\sum nW_n = a\sum\sum E_n$

The necessary calculations are as follows:

  n     $E_n$   $\sum E_n$   $W_n$   $\sum W_n$   $nW_n$   $\sum nW_n$   Graduated
                                                                           rate
 (1)     (2)      (3)         (4)      (5)         (6)       (7)           (8)

  0     1600     1,600        240      240                                .1465
  1     1800     3,400        162      402         162       162          .0911
  2     1800     5,200        117      519         234       396          .0661
  3     1600     6,800         80      599         240       636          .0518
  4     1200     8,000         54      653         216       852          .0427
  5      800     8,800         28      681         140       992          .0362
  6      300     9,100          9      690          54      1046          .0315
  7      100     9,200          4      694          28      1074          .0278

Total   9200    52,100        694     4478        1074      5158

Hence 694b + 1074 = 9200a

and 4478b + 5158 = 52,100a

giving a = .2407, b = 1.6428.

The graduated rates can then be calculated and the usual tests for
adherence to data carried out. The graduated rates are shown in the last
column above.
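
The summations and the two equations can be checked with the sketch below; the names are illustrative only. Solved without intermediate rounding, the pair of equations gives a of about .2440 and b of about 1.687, a little different from the printed constants; the graduated rates quoted in the text correspond to the printed values, which are therefore retained in the last line.

```python
from itertools import accumulate

exposed = [1600, 1800, 1800, 1600, 1200, 800, 300, 100]    # E_n
withdrawals = [240, 162, 117, 80, 54, 28, 9, 4]            # W_n
n_w = [n * w for n, w in enumerate(withdrawals)]           # n * W_n

def first_and_second_sums(values):
    """Total of the values and total of their running sums."""
    running = list(accumulate(values))
    return running[-1], sum(running)

s1_e, s2_e = first_and_second_sums(exposed)
s1_w, s2_w = first_and_second_sums(withdrawals)
s1_nw, s2_nw = first_and_second_sums(n_w)

# b*s1_w + s1_nw = a*s1_e   and   b*s2_w + s2_nw = a*s2_e
det = s1_w * s2_e - s1_e * s2_w
a = (s1_w * s2_nw - s1_nw * s2_w) / det
b = (s1_e * s2_nw - s2_e * s1_nw) / det
print(round(a, 4), round(b, 4))

# Graduated rates from the constants as printed in the text.
printed_rates = [round(0.2407 / (1.6428 + n), 4) for n in range(8)]
```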

12. Example of the $\chi^2$ test.

The application of the $\chi^2$ test to a graduation by curve-fitting
presents no special features, except that each constant in the
equation found from the given data results in the loss of one degree
of freedom.

For instance, in the example in the previous paragraph two
degrees of freedom were lost as a and b were found from the given
data.

We should first amalgamate the data for durations 6 and 7, where
the expected withdrawals are small.

Since the number of cells is then 7 and two constraints have been
imposed there are five degrees of freedom.

The value of $\chi^2$ at the foot of the last column is .570 and when
there are five degrees of freedom the probability of obtaining a value
equal to or larger than this is about .99. The fit therefore seems too
good to be true, due no doubt to the fact that the question is artificial
and was not based on actual data which, in the nature of the case, would
have included sampling errors such as are met with in practice.
No actuary, however, is likely to reject a graduation by a formula
method because the fit seems too good.


  n     $E_n$   Graduated   Expected       Actual        (4)-(3)   (5)²    $E_np_nq_n$   (6)/(7)
                rate        withdrawals    withdrawals
 (1)              (2)          (3)            (4)          (5)      (6)       (7)          (8)

  0     1600     .1465        234.4           240          5.6     31.4      200.1        .157
  1     1800     .0911        164.0           162         -2.0      4.0      149.0        .027
  2     1800     .0661        119.0           117         -2.0      4.0      111.1        .036
  3     1600     .0518         82.9            80         -2.9      8.4       78.6        .107
  4     1200     .0427         51.2            54          2.8      7.8       49.0        .156
  5      800     .0362         29.0            28         -1.0      1.0       28.0        .036
 6,7     400     .0306*        12.2            13           .8       .6       11.8        .051

Total   9200                  692.7           694          1.3                            .570
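
The table can be reproduced as follows. This is a sketch only: the names are illustrative, the rate for the amalgamated durations is weighted three to one as in the footnote, and the probability for five degrees of freedom is taken from the standard closed form for an odd number of degrees of freedom. Because the book rounds each column before proceeding, the total here comes out at about .56 against the printed .570.

```python
import math

exposed = [1600, 1800, 1800, 1600, 1200, 800, 400]        # durations 6 and 7 amalgamated
actual = [240, 162, 117, 80, 54, 28, 13]
rates = [0.1465, 0.0911, 0.0661, 0.0518, 0.0427, 0.0362,
         0.75 * 0.0315 + 0.25 * 0.0278]                   # weighted rate for durations 6, 7

chi2 = 0.0
for e, w, q in zip(exposed, actual, rates):
    expected = e * q
    chi2 += (w - expected) ** 2 / (e * q * (1 - q))       # (actual - expected)^2 / (E p q)

def chi2_upper_tail_5df(x):
    """Probability that chi-square with 5 degrees of freedom is at least x."""
    root = math.sqrt(x)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * density * (root + root ** 3 / 3)

print(round(chi2, 3), round(chi2_upper_tail_5df(chi2), 3))
```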


* Allowing for the weight of the exposed to risk at durations 6 and 7 the
graduated rate on amalgamation was taken as ¾ × .0315 + ¼ × .0278.

13. The N.H.I. Table.

For the purposes of the National Health Insurance Act 1911 a
table of mortality (males and females separately) was required for
the calculation of Reserve Values. The table was based on the
recorded deaths in England and Wales for the three years 1908-10
and an estimated population as at 30th June 1909.

The results of the Census as at 31st March 1911 were not
available, but an estimate in decennial age-groups was provided. The
required figures for 30th June 1909 were interpolated from these
figures and the corresponding figures for the 1901 Census on the
assumption that the numbers increased in arithmetical progression.

Thus the 1909 figure was taken as

population 1901 + .825 (population 1911 - population 1901).

The figures operated upon were not the usual grouped
populations and deaths but the population at ages x and over and the deaths
occurring at ages x and over as used in plotting an ogive curve.
The processes used are very clearly set out in the part of the Report
which is reproduced in J.I.A. Vol. xlvii, pp. 548-59. The reader
is strongly advised to study this extract, which explains both the
theoretical and practical aspects very lucidly.

14. The $O^{[M]}$, $O^{M(5)}$ and $O^{[NM]}$ Tables.

The $O^{[M]}$ table was based on the experience of whole-life with
profit policies over the years 1863-93 and has a select period of
ten years. The ultimate part of the table is therefore sometimes
called the $O^{M(10)}$ table.

The $O^{M}$ and $O^{M(5)}$ tables are both interesting in that the full
select period of ten years used in the $O^{[M]}$ table was abandoned.
For the $O^{M}$ table, sometimes known as the aggregate table,
the data for all durations were amalgamated, while for the $O^{M(5)}$
the experience for the first five years after entry was excluded.
Thus the $O^{M}$, $O^{M(5)}$ and $O^{M(10)}$ tables differ from each other because
of the exclusion of data relating to the early durations.

The $O^{[NM]}$ table was based on the experience of whole-life
non-profit policies over the years 1863-93. In this experience it was
found that a select period of five years was appropriate.

The $O^{M}$, $O^{M(5)}$ and the ultimate portion of the $O^{[M]}$ and $O^{[NM]}$ tables
were all graduated by Makeham's formula with $\log_{10} c$ taken as .039.

For the select portion of the $O^{[M]}$ table it was found possible

to assume that

where $A_t$ and $B_t$ are independent of x.

For details the reader is referred to J.I.A. Vol. xxxviii, pp. 501
et seq., reproduced in Reprints 1935.

15. Re-graduation of the Table by Makeham's formula.

The table was not graduated by Makeham's formula. For a

special purpose Mr G. J. Lidstone re-graduated it by that formula,
although he realized that a certain amount of distortion would arise.
The method used is interesting in that it was not necessary to refer
to the original data; further, it had the merit of giving speedy results.
Its use should, however, be restricted to the type of problem for
which it was derived and the graduation of rough data should be
carried out by the methods previously described. Mr Lidstone put

$\operatorname{colog}_{10} p_x = \alpha + \beta c^x$

and took the values of $\operatorname{colog}_{10} p_x$ from the table that he was
re-graduating.

He then used the following three equations for finding $\alpha$, $\beta$ and c:

$\sum_{x=25}^{64} \operatorname{colog}_{10} p_x = 40\alpha + \beta c^{25}\,\frac{c^{40} - 1}{c - 1},$

$\sum\sum_{x=25}^{64} \operatorname{colog}_{10} p_x = \frac{40 \times 41}{2}\,\alpha + \beta\sum\sum_{x=25}^{64} c^x$

and

$\sum\sum\sum_{x=25}^{64} \operatorname{colog}_{10} p_x = \frac{40 \times 41 \times 42}{2 \times 3}\,\alpha + \beta\sum\sum\sum_{x=25}^{64} c^x,$

where the successive summations of the tabular values were taken
for ages 25 to 64 inclusive, the summations of $c^x$ being expressible as
geometrical progressions in c. This was the range of ages for which it
was desired to fit the Makeham curve; higher ages were relatively
unimportant.


16. The use of two curves. Blending. 

Often, particularly in recent years, it has been found impossible 
to fit a single curve to the whole of the data, although one curve 
may have been satisfactory at the younger ages and a second curve 
at the higher ages. Clearly such a graduation is not as satisfactory 
as when a single curve is used; it has, however, the great advantage 
that the rates progress smoothly. The chief difficulty is in passing 
from one curve to the other and this brings us to the question of
blending and blending functions.


17. Blending functions.

Suppose that a curve has been fitted to the data at the younger
ages, giving graduated values

$u'_1, u'_2, \dots, u'_r, \dots, u'_{s-1}$

and a second curve at the higher ages, giving graduated values

$u''_{r+1}, u''_{r+2}, \dots, u''_{s-1}, u''_s, \dots$

In other words the two curves are assumed to overlap and there are
two graduated values for

$u_{r+1}, u_{r+2}, \dots, u_{s-1}.$

The problem is to combine or fuse these two values in such a way
that the final values pass smoothly from the first curve, which gives
values up to and including $u_r$, to the second curve, which gives
values from $u_s$ onwards.

Assume that a typical blended function is given by the
equation

$u_{r+t} = \kappa_{r+t}u'_{r+t} + \lambda_{r+t}u''_{r+t},$    (19)

where $\kappa_{r+t}$ and $\lambda_{r+t}$ are not constants but functions of t.

If $u_{r+t}$ is to be a blend of varying proportions of the two u's we
make

$\kappa_{r+t} + \lambda_{r+t} = 1.$    (20)

Also, in order to make the blended function merge with the main
graduations at each end ($t = 0$ and $t = s - r$), we have

$\kappa_r = 1, \quad \lambda_r = 0,$
$\kappa_s = 0, \quad \lambda_s = 1.$    (21)

Between these extremes $\kappa_{r+t}$ should clearly diminish steadily and
$\lambda_{r+t}$ increase; and it is usual, though not essential, to make

$\kappa_{r+t} = \lambda_{s-t}.$    (22)

In this event the values of $\lambda$ are merely the values of $\kappa$ in reverse order.

$\kappa_{r+t}$ (and therefore $\lambda_{r+t}$) should be a continuous function. Further,
since it is unity if t is zero or negative and commences to diminish
as t becomes positive, it is essential in order to make a smooth
transition that $\frac{d\kappa}{dt}$ should be zero when $t = 0$, and that $\frac{d^2\kappa}{dt^2}$ and
preferably $\frac{d^3\kappa}{dt^3}$ should be either zero or small for the same value.
The same conditions should hold when $t = s - r$,
so that the blending curve has little curvature at either end.

To produce a smooth transition from one curve to the other the
range of values over which blending is carried out must be fairly
large. Much will depend on the differences between the pairs of
overlapping values and also on the differences in gradient and
curvature of the two main curves at the ends of the blending range.


18. The curve of sines.

A natural blending function is the sine of an angle because of its
smoothness and the zero gradient when the angle is a multiple of $\pi$.

19. The curve of squares.

If we take the section AB of the first curve and run it into the
section CD of the second we obtain a curve of the form

[Diagram: the section AB of the first curve joined to the section CD of the second]

which should be satisfactory for blending.
To do this we take

$\kappa_{r+t} = 1 - 2\left(\frac{t}{s-r}\right)^2$ for the portion AB, when $t \le \frac{1}{2}(s-r)$,

and

$\kappa_{r+t} = 2\left(1 - \frac{t}{s-r}\right)^2$ for the portion CD, when $t \ge \frac{1}{2}(s-r)$.

It will be noticed that, as before, $\kappa_{r+t} = \lambda_{s-t}$, so that the values of
$\lambda$ and $\kappa$ are the same but in reverse order.

$\kappa_{r+t}$ decreases from unity when $t = 0$ to zero when $t = s - r$.

$\frac{d\kappa_{r+t}}{dt} = -\frac{4t}{(s-r)^2}$

if $t \le \frac{1}{2}(s-r)$, and therefore vanishes when $t = 0$; and

$\frac{d\kappa_{r+t}}{dt} = -\frac{4(s-r-t)}{(s-r)^2}$

if $t \ge \frac{1}{2}(s-r)$, and therefore vanishes
when $t = s - r$.

$\frac{d^2\kappa_{r+t}}{dt^2} = -\frac{4}{(s-r)^2}$

when $t < \frac{1}{2}(s-r)$ and

$\frac{d^2\kappa_{r+t}}{dt^2} = \frac{4}{(s-r)^2}$

when $t > \frac{1}{2}(s-r)$.

Here again a reasonable value of $s - r$ will make the curvature
quite small at both ends of the range.

When $t = \frac{1}{2}(s-r)$, i.e. at the point B or C in the third diagram,

$\kappa_{r+t}$ is continuous (value ½);

$\frac{d\kappa_{r+t}}{dt}$ is continuous (value $-\frac{2}{s-r}$).

For the curve AB,

$\frac{d^2\kappa_{r+t}}{dt^2} = -\frac{4}{(s-r)^2},$

whereas for the curve CD,

$\frac{d^2\kappa_{r+t}}{dt^2} = \frac{4}{(s-r)^2}.$

From the theoretical point of view, therefore, the curve of sines,
which is a natural blending function, is to be preferred to the more
artificial curve of squares.

In practice the use of either method is likely to give similar
results.
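
The closeness of the two functions can be seen from a short calculation. In the sketch below the curve of sines is assumed to take the form $\frac{1}{2}\{1 + \cos(\pi t/(s-r))\}$, which is the form quoted later for the Government Annuitants graduation; the function names and the choice $s - r = 12$ are illustrative.

```python
import math

def kappa_sines(t, span):
    """Curve of sines, assumed form 1/2 * (1 + cos(pi * t / span))."""
    return 0.5 * (1 + math.cos(math.pi * t / span))

def kappa_squares(t, span):
    """Curve of squares: two parabolic arcs meeting at the middle of the range."""
    if t <= span / 2:
        return 1 - 2 * (t / span) ** 2
    return 2 * (1 - t / span) ** 2

SPAN = 12
rows = [(t, kappa_sines(t, SPAN), kappa_squares(t, SPAN)) for t in range(SPAN + 1)]
largest_gap = max(abs(s - q) for _, s, q in rows)
for t, s, q in rows:
    print(f"{t:2d}  {s:.3f}  {q:.3f}")
print("largest difference:", round(largest_gap, 3))
```

For this range the two sets of values never differ by as much as .03, so that the choice between them will seldom affect a graduation materially.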


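20. Polynomial blending functions.

The polynomial $\kappa_{r+t} = a + bt + ct^2 + dt^3$

can be used as a blending function if the values of a, b, c and d are
found as follows.

Since $\kappa_{r+t} = 1$ when $t = 0$ and 0 when $t = s - r$, $a = 1$

and $a + b(s-r) + c(s-r)^2 + d(s-r)^3 = 0$.

Also, since $\frac{d\kappa_{r+t}}{dt} = 0$ at both ends of the range,

$b + 2ct + 3dt^2 = 0$, when $t = 0$ or $s - r$.

Solving, we find that

$\kappa_{r+t} = 1 - 3\left(\frac{t}{s-r}\right)^2 + 2\left(\frac{t}{s-r}\right)^3.$

Finally,

$\frac{d^2\kappa_{r+t}}{dt^2} = -\frac{6}{(s-r)^2}$ when $t = 0$    (26)

and $\frac{6}{(s-r)^2}$ when $t = s - r$.

Thus the curve of squares and, still more, the curve of sines are
superior as regards curvature at the ends of the range of blending.
Nevertheless as a natural, instead of a hybrid, blending function the
polynomial possesses certain advantages over the curve of squares.

21. Modified blending functions.

In using a blending function for values of t from 0 to $s - r$ we
are in effect assuming that $\kappa_{r+t}$ is unity if t is negative and zero if
t is greater than $s - r$. Hence we must take care to ensure zero gradient
and small curvature when the value begins to change.

In Tables XXIII to XXVI the values of the functions for $t = -1$
and $t = 13$ are inserted to illustrate this point.

In order to ease the junction with the main curves some writers
prefer to effect the blending from

$u_{r+\frac{1}{2}}$ to $u_{s-\frac{1}{2}}$,

i.e. over $s - r - 1$ values instead of $s - r$.

Thus $\kappa_{r+\frac{1}{2}} = 1$, $\kappa_{s-\frac{1}{2}} = 0$.

The following values of $\kappa_{r+t}$ result:

curve of sines: $\frac{1}{2}\left\{1 + \cos\frac{(t - \frac{1}{2})\pi}{s-r-1}\right\}$;

curve of squares: $1 - 2\left(\frac{t - \frac{1}{2}}{s-r-1}\right)^2$ if $t \le \frac{1}{2}(s-r)$, or $2\left(\frac{s - r - \frac{1}{2} - t}{s-r-1}\right)^2$ if $t \ge \frac{1}{2}(s-r)$;

polynomial: $1 - 3\left(\frac{t - \frac{1}{2}}{s-r-1}\right)^2 + 2\left(\frac{t - \frac{1}{2}}{s-r-1}\right)^3$.

Table XXVI shows the values for the curve of squares when
$s - r = 12$. The values of $\Delta\kappa_{r+t}$ and $\Delta^2\kappa_{r+t}$ will enable the reader to
judge to what extent the junction with the main curves is eased by
shortening the blending range.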
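
The polynomial and the modified curve of squares can be tabulated for $s - r = 12$ by the sketch below, which follows the expressions of this and the preceding paragraph; the function names are illustrative, and the modified function is taken as unity before $t = \frac{1}{2}$ and zero after $t = s - r - \frac{1}{2}$.

```python
def kappa_polynomial(t, span):
    """Polynomial blending function 1 - 3u^2 + 2u^3 with u = t / span."""
    u = t / span
    return 1 - 3 * u ** 2 + 2 * u ** 3

def kappa_squares_modified(t, span):
    """Curve of squares blended from t = 1/2 to t = span - 1/2."""
    u = (t - 0.5) / (span - 1)
    if u <= 0:
        return 1.0
    if u >= 1:
        return 0.0
    return 1 - 2 * u ** 2 if u <= 0.5 else 2 * (1 - u) ** 2

SPAN = 12
for t in range(-1, SPAN + 2):
    # Outside the blending range the polynomial is held at its end values.
    poly = kappa_polynomial(min(max(t, 0), SPAN), SPAN)
    print(f"{t:3d}  {poly:.3f}  {kappa_squares_modified(t, SPAN):.3f}")
```

The figures agree with Tables XXV and XXVI to within a unit in the third decimal place; the small discrepancies appear to arise because the printed values were built up from rounded differences.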
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Table XXIV. Blending function — 
curve of squares

($s - r = 12$)

$\kappa_{r+t} = 1 - 2\left(\frac{t}{12}\right)^2 = 1 - \frac{t^2}{72}$ when $t \le \frac{1}{2}(s-r)$


Table XXIII. Blending function: curve of sines


Table XXV. Blending function: polynomial

($s - r = 12$)

$\kappa_{r+t} = 1 - 3\left(\frac{t}{12}\right)^2 + 2\left(\frac{t}{12}\right)^3 = 1 - \frac{t^2}{48} + \frac{t^3}{864}$

   t     $\kappa_{r+t}$   $\Delta\kappa_{r+t}$   $\Delta^2\kappa_{r+t-1}$

 (-1)      (1.000)           (.000)               (.000)
   0        1.000            -.020               (-.020)
   1         .980            -.054                -.034
   2         .926            -.082                -.028
   3         .844            -.103                -.021
   4         .741            -.117                -.014
   5         .624            -.124                -.007
   6         .500            -.124                 .000
   7         .376            -.117                +.007
   8         .259            -.103                +.014
   9         .156            -.082                +.021
  10         .074            -.054                +.028
  11         .020            -.020                +.034
  12         .000            (.000)              (+.020)
 (13)       (.000)                                (.000)


Table XXVI. Modified blending function: curve of squares

($s - r = 12$)

$\kappa_{r+t} = 1 - 2\left(\frac{t - \frac{1}{2}}{s-r-1}\right)^2$, when $t \le \frac{1}{2}(s-r)$

$\kappa_{r+t} = 2\left(\frac{s - r - \frac{1}{2} - t}{s-r-1}\right)^2$, when $t \ge \frac{1}{2}(s-r)$

   t     $\kappa_{r+t}$   $\Delta\kappa_{r+t}$   $\Delta^2\kappa_{r+t-1}$

 (-1)      (1.000)           (.000)               (.000)
  (0)      (1.000)          (-.004)              (-.004)
   1         .996            -.033               (-.029)
   2         .963            -.066                -.033
   3         .897            -.099                -.033
   4         .798            -.132                -.033
   5         .666            -.166                -.034
   6         .500            -.166                 .000
   7         .334            -.132                +.034
   8         .202            -.099                +.033
   9         .103            -.066                +.033
  10         .037            -.033                +.033
  11         .004           (-.004)              (+.029)
 (12)       (.000)           (.000)              (+.004)
 (13)       (.000)                                (.000)

For explanation of figures in brackets see paragraph 21.

22. Limitations of blending. Alternative methods.

[Fig. 6 and Fig. 7]

Blending can be relied upon to give good results if the two main
curves do not intersect as in Fig. 6 or if they intersect twice as in
Fig. 7. The curve represented by the dotted line EF should have the
same tangent as CD at E and the same tangent as AB at F.

In the second instance the curve passes through A and B.

[Fig. 8]

When the two main curves intersect only once, as at the point O
in Fig. 8, a blending process is unlikely to give the best results.
A good blend is indicated by a curve such as EPF, but any blending
process normally employed makes the curve pass through the point
of intersection O and drags it out of its natural course. (This does
not apply of course if there are two points of intersection.)

Generally speaking, therefore, when the two main curves intersect
only once near the point where blending is to be effected it is
preferable to pass from one curve to the other by a process of osculatory
interpolation. This process is described in Mathematics for Actuarial
Students, Part II, Chapter VII, and will be met with again in Chapter
X of this book. Briefly it may be said to be a process of interpolation
which ensures a smooth join at each end of the interval in which the
values are being inserted.

23. Offices Annuitants Experience, 1900-20. The a(m) and a(f)
Tables.

The rates of mortality adopted as a basis were obtained by a
process of extrapolation from the rates for the 1863-93 experience
and those for 1900-20. They were therefore fairly smooth without
any graduation, but apart from the desire for more places of decimals
it was decided for reasons of practical convenience to attempt to
fit a Makeham or Gompertz curve to the rates.

The problem was more akin to the re-graduation of an existing
table than to the graduation of rough data. In any event the exposed
to risk and deaths for the years 1900-20 did not relate to the
extrapolated rates which formed the basis of the tables.

The constants were therefore found from the rough rates
($\operatorname{colog} p_x$) and not by reference to observed data.

More than one attempt was made unsuccessfully and for details
the reader is referred to the official report, The Mortality of
Annuitants, 1900-1920.

The function operated on was $\operatorname{colog} p_x$, which was assumed to be
of the form $A + Bc^x$. A and B therefore have not the same meaning
as the constants in the formula for $\mu_x$.

The methods of graduation finally adopted were as follows:

Females. A Makeham curve was fitted to the data at ages 55, 60,
65, 70, 75 and 80.

This curve gave good values also for ages 50 and 85, but at higher
ages it overstated the mortality as had the other curves experimented
with.

A second Makeham curve was therefore found which would
reproduce the values for ages 80 and 85 given by the other curve
and also give a value for age 100 approximating to the ungraduated
rate at that age. The constants were found direct from the three
equations

$\operatorname{colog} p_{80} = A + Bc^{80}$
$\operatorname{colog} p_{85} = A + Bc^{85}$
$\operatorname{colog} p_{100} = A + Bc^{100}$

The two curves consequently overlapped over the range 80-85
and intersected at each end. A blending method was therefore
adopted. Ages last birthday were used in the investigation and it was
found that on the average the exact age exceeded the age last
birthday by 4½ months or .375 year. The points of intersection of the
graduating curves were therefore given by exact ages 80.375 and
85.375.

This accounts for the factor $\left(\frac{85.375 - x}{5}\right)^2$ in the following table,

which shews how the blending was effected.


Table XXVII

Exact    Main          Old age       Difference   Adjusting   (4)×(5)    Blended
age x    graduation    graduation    (3)-(2)      factor                 value
                                                                         (3)-(6)
 (1)        (2)           (3)           (4)          (5)        (6)        (7)

  81      .038534       .038883       .000349      .7656     .000267    .038616
  82      .043159       .043906       .000747      .4556     .000340    .043566
  83      .048383       .049286       .000903      .2256     .000204    .049082
  84      .054285       .055057       .000772      .0756     .000058    .054999
  85      .060952       .061234       .000282      .0056     .000002    .061232
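
The arithmetic of Table XXVII can be verified as follows. The sketch assumes that the adjusting factor is the square of the remaining fraction of the blending range, which reproduces the printed figures; the variable names are illustrative.

```python
ages = [81, 82, 83, 84, 85]
main = [0.038534, 0.043159, 0.048383, 0.054285, 0.060952]       # main graduation
old_age = [0.038883, 0.043906, 0.049286, 0.055057, 0.061234]    # old age graduation
START, END = 80.375, 85.375        # exact ages at which the two curves intersect

factors, blended = [], []
for x, u1, u2 in zip(ages, main, old_age):
    factor = ((END - x) / (END - START)) ** 2    # adjusting factor, column (5)
    factors.append(factor)
    blended.append(u2 - (u2 - u1) * factor)      # column (3) less column (6)
    print(x, round(factor, 4), round(blended[-1], 6))
```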


Males. As the preliminary tests had suggested a Gompertz
graduation it was assumed that

$\operatorname{colog} p_x = Bc^x.$

Hence $\log(\operatorname{colog} p_x) = \log B + x\log c,$

and $\sum_{-n}^{n} \log(\operatorname{colog} p_x) = (2n + 1)\log B,$

for an odd number of terms, or

$= 2n\log B,$    (27)

for an even number of terms.

Also $\sum_{-n}^{n} x\log(\operatorname{colog} p_x) = \log c\sum_{-n}^{n} x^2.$    (28)

Taking the origin at age 75 and using 5 years as unit the work
was as follows:

Table XXVIII

 Age    $\log(\operatorname{colog} p_x)$     n      $n\log(\operatorname{colog} p_x)$
                                                       -            +

  50         -2.360                -5                            11.800
  55         -2.213                -4                             8.852
  60         -2.015                -3                             6.045
  65         -1.850                -2                             3.700
  70         -1.699                -1                             1.699
  75         -1.501                 0
  80         -1.283                 1              1.283
  85         -1.106                 2              2.212
  90         -0.945                 3              2.835
  95         -0.768                 4              3.072
 100         -0.573                 5              2.865

Total       -16.313                        32.096 - 12.267 = 19.829


The equations for B and c were therefore

$11\log B = -16.313$ or $\log B = -1.48300$;

and since $\sum_{-5}^{5} n^2 = 2(1^2 + 2^2 + 3^2 + 4^2 + 5^2) = 110,$

equation (28) became

$110\log C = 19.829$ or $\log C = .180264,$

where $C = c^5$.

Hence $\log c = .0360528.$
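
The working of Table XXVIII may be repeated mechanically. The sketch below uses the tabulated values of $\log(\operatorname{colog} p_x)$; the variable names are illustrative.

```python
ages = list(range(50, 101, 5))
log_colog_p = [-2.360, -2.213, -2.015, -1.850, -1.699, -1.501,
               -1.283, -1.106, -0.945, -0.768, -0.573]
n = [(x - 75) // 5 for x in ages]              # origin at age 75, unit of 5 years

sum_y = sum(log_colog_p)
sum_ny = sum(k * y for k, y in zip(n, log_colog_p))
sum_nn = sum(k * k for k in n)

log_B = sum_y / len(ages)     # equation (27): the sum of n vanishes
log_C = sum_ny / sum_nn       # equation (28), where C = c**5
log_c = log_C / 5
print(round(log_B, 5), round(log_C, 6), round(log_c, 7))
```

Because the values of n are symmetrical about the origin, these two equations are the same as those given by the method of least squares applied to $\log(\operatorname{colog} p_x)$ with equal weights.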

This graduation was unsatisfactory, and an attempt was made to
fit a Makeham curve by using the equations

$\sum_{-n}^{n} \operatorname{colog} p_x = (2n + 1)A + B\sum_{-n}^{n} c^x$    (29)

and $\sum_{-n}^{n} x\operatorname{colog} p_x = B\sum_{-n}^{n} xc^x.$    (30)

Taking 5 years as unit and $C = c^5 = 1.58$ the work was arranged
as follows:


Table XXIX

 Age    $\operatorname{colog} p_x$     n      $n\operatorname{colog} p_x$       $C^n$         $nC^n$
                                  -          +                    -         +

  50      .00436      -5       .02180                 .102       .510
  55      .00612      -4       .02448                 .160       .640
  60      .00966      -3       .02898                 .254       .762
  65      .01412      -2       .02824                 .401       .802
  70      .02000      -1       .02000                 .633       .633
  75      .03152       0                             1.000
  80      .05208       1                  .05208     1.580                1.580
  85      .07831       2                  .15662     2.496                4.992
  90      .11351       3                  .34053     3.944               11.832
  95      .17070       4                  .68280     6.232               24.928
 100      .26700       5                 1.33500     9.847               49.235

Total     .76738             -.12350 + 2.56703      26.649     -3.347 + 92.567
                                 = 2.44353                        = 89.220


The equations for the constants were therefore

$89.220B = 2.44353$ and $11A + 26.649B = .76738.$

This graduation was also unsuccessful. The above tables have
been reproduced to show how practical work should be set out and
to emphasize that to find the constants the rates and not the exposed
to risk and deaths were used.

Eventually one Gompertz curve was fitted at ages 50, 55, 60, 65
and 70 and a second Gompertz curve at ages 80, 85, 90, 95 and 100.

Actually the first curve was continued up to age 81 and the second
carried back to age 70, so that the position was as shown in Fig. 6.

The rates (or rather values of $\log(\operatorname{colog} p_x)$) were made to relate
to exact ages on the same assumption as before and were then
blended by the curve of squares as shewn in Table XXX.

The central entry in column (2) is $\sqrt{.5}$ and the other entries are
9/10, 7/10, 5/10, 3/10 and 1/10 of this value. Column (7) represents the
additions to be made to the values of Graduation I for ages 71-75
and the deductions to be made from the values of Graduation II
for ages 76-80.

Table XXX

                    (Factor)²    $\log(\operatorname{colog} p_x)$        Difference   Difference   Interpolated
 Age    Factor         h         Graduation   Graduation                    × h       $\log(\operatorname{colog} p_x)$
                                     I            II
 (1)      (2)         (3)           (4)          (5)           (6)         (7)           (8)

  71    .0707107     .005      $\bar{2}$.3307   $\bar{2}$.38432     .05362      .00027     $\bar{2}$.33097
  72    .2121321     .045          .3644        .41944       .05504      .00248        .36688
  73    .3535535     .125          .3981        .45456       .05646      .00706        .40516
  74    .4949749     .245          .4318        .48968       .05788      .01418        .44598
  75    .6363963     .405          .4655        .52480       .05930      .02402        .48952
  ...   .7071069     .500
  76    .6363963     .405          .4992        .55992       .06072      .02459        .53533
  77    .4949749     .245          .5329        .59504       .06214      .01523        .57981
  78    .3535535     .125          .5666        .63016       .06356      .00795        .62221
  79    .2121321     .045          .6003        .66528       .06498      .00292        .66236
  80    .0707107     .005          .6340        .70040       .06640      .00033        .70007

The rates at ages under 50 were not taken from Graduation I,
which gave very low rates, but were arranged so that they ran
reasonably as compared with the female rates.


24. Government Life Annuitants Table, 1900-20.

Because of the practical advantages where joint-life functions are 
concerned an attempt was made to fit a single curve to the whole of 
the data. This, however, was unsuccessful. The rates at young ages 
were very low, particularly for females, and increased slowly up to 
age 70 and from age 90 onwards. 

Between ages 70 and 90 the rise was fairly rapid. 

As with a Makeham curve the accumulated deviations were too
large to be ignored, the double Gompertz curve, for which

$$\operatorname{colog} p_x = Mc^x + Nl^x,$$

was tried, but although a fairly good fit was obtained for the female
experience between ages 50 and 90 the constant $l$ was less than
unity. This meant that the second term had little effect over age 70
but made the rates below age 46 or 47 increase with a decrease in
age. The male experience was still less adaptable.
Finally, the following method was used.

Males. A Makeham curve was fitted at ages 44 to 70 and an
ordinary Gompertz curve at ages 64 to 89. These curves intersected
once only at about age 65 and had to be blended over the range 61
to 69 inclusive.

The curve of sines was used with $k = \tfrac{1}{2}\left(1 + \cos\frac{t\pi}{10}\right)$, but as was
to be expected with the curves intersecting only once the results
were not entirely satisfactory. It is understood they were
subsequently hand-polished to produce a better transition from one
curve to the other.

At ages over 90 the agreement with the crude rates was
unsatisfactory and an attempt to improve it by re-calculating the
constants of the curve merely produced distortion elsewhere.
Finally, the limiting age was taken as 103 and the rates for ages over
90 were inserted so as to give a reasonable agreement with the
ungraduated rates.


Females. A Makeham curve was fitted at ages 40 to 67 and a
Gompertz curve at ages 65 to 86. The curves intersected once
between ages 66 and 67 and the overlapping values of $\operatorname{colog} p_x$ at
ages 63 to 71 were blended by the curve of sines as before. The same
difficulty arose at high ages and the rates at ages over 90 were
inserted in the light of the ungraduated rates, the limiting age
being taken as 105.

To produce a table extending to young ages, English Life Table
No. 8 was used (males and females separately) to find values of
$q_x$ for ages 5 and 25. The values for ages 45 and 46 were taken from
the main graduation, thus ensuring a fairly smooth join.

It was assumed that $q_x$ at ages under 46 was a polynomial of the
third degree, and the rates from age 6 to age 44 were inserted by
assuming constant third differences.

$q_5$ (from English Life No. 8) $= u_0$,

$q_{25}$ (from English Life No. 8) $= (1 + 20\Delta + 190\Delta^2 + 1140\Delta^3)\,u_0$,

$q_{45}$ (from main graduation) $= (1 + 40\Delta + 780\Delta^2 + 9880\Delta^3)\,u_0$,

$q_{46}$ (from main graduation) $= (1 + 41\Delta + 820\Delta^2 + 10660\Delta^3)\,u_0$,

whence $\Delta u_0$, $\Delta^2 u_0$ and $\Delta^3 u_0$ were found.
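
The four conditions above amount to three linear equations in the leading differences, which can be solved mechanically. The sketch below does this by exact elimination; the four input rates are hypothetical figures chosen only to illustrate the arithmetic, and the names `solve_differences` and `rate` are not taken from the text.

```python
from fractions import Fraction
from math import comb


def solve_differences(q5, q25, q45, q46):
    """Return u_0 and the leading differences (first, second, third) of u_0."""
    u0 = Fraction(q5)
    # Each row is [C(n,1), C(n,2), C(n,3) | q_(5+n) - u_0] for n = 20, 40, 41.
    rows = [
        [Fraction(comb(n, k)) for k in (1, 2, 3)] + [Fraction(q) - u0]
        for n, q in ((20, q25), (40, q45), (41, q46))
    ]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for i in range(3):
        pivot = rows[i][i]
        rows[i] = [v / pivot for v in rows[i]]
        for j in range(3):
            if j != i:
                factor = rows[j][i]
                rows[j] = [a - factor * b for a, b in zip(rows[j], rows[i])]
    return u0, [rows[i][3] for i in range(3)]


# Hypothetical rates, for illustration only.
u0, diffs = solve_differences("0.00340", "0.00390", "0.00799", "0.00832")


def rate(age):
    """q at the given age from the third-difference curve with origin at age 5."""
    n = age - 5
    return u0 + sum(comb(n, k) * d for k, d in zip((1, 2, 3), diffs))


for age in (5, 15, 25, 35, 45, 46):
    print(age, float(rate(age)))
```

The coefficients 20, 190, 1140 and so on are simply the binomial coefficients of $(1+\Delta)^{20}$, $(1+\Delta)^{40}$ and $(1+\Delta)^{41}$, which is why `math.comb` reproduces them.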
The rates for ages 5 to 10 were assumed to be constant, although
the third-difference curve produced a slight dip at this section. 

25. Curve fitting. General remarks. 

As used by the statistician the term “curve-fitting” is usually
applied to the process whereby observed data (as distinct from ratios
such as rates of mortality derived from the data) have a
mathematical curve fitted to them. There may be a variety of reasons for
such a step, but we need not consider them in this book.

A very common example, viz. the fitting of a normal curve, has 
already been considered. It is only necessary to find the mean of 
the distribution (which is taken as origin) and the standard deviation. 

Another typical example of actuarial curve-fitting is afforded by 
the National Health Insurance table mentioned on p. 237. Here a 
mathematical curve was fitted to the exposed to risk in ogive form. 
The actual deaths were then adjusted so as to leave the group rate 
of mortality the same as before and finally a curve was fitted to the 
deaths so adjusted. 

The most important curves for fitting to statistical data were
developed by Karl Pearson and bear his name.

They are solutions of the differential equation

$$\frac{1}{f(x)}\,\frac{df(x)}{dx} = \frac{x+a}{b + cx + dx^2}.$$

From the available data constants such as the mean, standard 
deviation, measures of skewness, etc. have first to be calculated. 
The theory underlying the reasons for these calculations is too 
involved for discussion here. There are many excellent books dealing 
specially with the subject and the reader is particularly recommended 
to read Frequency Curves and Correlation by Sir William Elderton, 
which deals very fully with the practical difficulties likely to arise. 

It will be sufficient here to mention briefly two general methods 
of approach often referred to in actuarial literature. 

26. Method of least squares.

Suppose that for a given value of the variable (e.g., the age) the
difference between the observed value and the value according to
the curve fitted to the data is $\epsilon$. Suppose further that we can obtain
a measure of the standard deviation $\sigma$ of all the $\epsilon$'s if a large number
of values were available instead of one observed value.

We then have an observed deviation of $\epsilon$ with a standard error $\sigma$,
and if the normal curve applied we could say that the probability
of an observed error $\epsilon$ was $k e^{-\epsilon^2/2\sigma^2}$.

This would apply at every value of the variable, so that the
combined probability of a whole set of independent errors arising would
be $K e^{-\sum \epsilon^2/2\sigma^2}$ and would be a maximum if $\sum \dfrac{\epsilon^2}{2\sigma^2}$ were a minimum.

Consequently the most acceptable curve fitted to the data, i.e. the
one which produces the most likely set of errors or discrepancies,
would be the one which made $\sum \dfrac{\epsilon^2}{2\sigma^2}$ a minimum.

This is the basis of the Method of Least Squares.
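
As a small illustration of the criterion, the sketch below fits a straight line $a + bt$ by minimising the sum of $\epsilon^2/\sigma^2$ (the constant factor $\tfrac{1}{2}$ does not affect the position of the minimum). The straight line, the data and the function name `weighted_line_fit` are assumptions made for the example; the curves used in practice are seldom linear in their constants.

```python
def weighted_line_fit(t, y, sigma):
    """Fit y = a + b*t by minimising sum((y - a - b*t)**2 / sigma**2).

    Equating to zero the partial derivatives with respect to a and b gives
    two linear (normal) equations, solved here in closed form.
    """
    w = [1.0 / s ** 2 for s in sigma]                 # weights 1/sigma^2
    sw = sum(w)
    st = sum(wi * ti for wi, ti in zip(w, t))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    stt = sum(wi * ti * ti for wi, ti in zip(w, t))
    sty = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    det = sw * stt - st * st
    b = (sw * sty - st * sy) / det
    a = (sy - b * st) / sw
    return a, b


# Hypothetical observations and standard errors, for illustration only.
durations = [0, 1, 2, 3, 4]
observed = [0.100, 0.071, 0.052, 0.058, 0.043]
std_errors = [0.009, 0.008, 0.007, 0.008, 0.008]

a, b = weighted_line_fit(durations, observed, std_errors)
print(round(a, 4), round(b, 4))
```

Setting all the standard errors equal reduces this to the ordinary unweighted fit, in which the sum of the squared errors alone is minimised.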

F. M. Redington has suggested (J.S.S. Vol. IV, No. 4) that
instead of dividing each $\epsilon$ by $\sigma$ we could instead divide by the
square root of the “expected” value of the function, i.e. the value
given by the curve.

That is because $\sigma^2$ is usually of the form $npq$ and in actuarial work
is very nearly $nq$, the “expected” value. As Redington points out,
however, some of the assumptions made may be wide of the mark.

In fitting a mathematical curve the errors $\epsilon$ produced are not
likely to be random and to follow the normal law.

Consequently theoretical niceties are out of place; it is quite
usual to assume that all the $\sigma$'s are approximately equal and to make
$\sum \epsilon^2$ a minimum.

Theoretically this is simple.

$\sum \epsilon^2$ is written down in terms of the observed values and the values
according to the curve.


For example, in fitting the curve

$$\mu_x = \frac{A + Bc^x}{Kc^{-x} + 1 + Dc^x}$$

we should write

$$\sum \epsilon^2 = \sum \left( \mu'_x - \frac{A + Bc^x}{Kc^{-x} + 1 + Dc^x} \right)^2,$$

where $\mu'_x$ is the observed value.

To make this a minimum we equate to zero the partial differential
coefficients with respect to $A$, $B$, $K$, $D$ and $c$, thus producing five
equations for the five unknowns. The practical side of the whole
subject is, however, much more complicated than is apparent from
this brief outline.

27. Method of moments. 

If $f_x$ represents the frequency with which a value $x$ of the
variable is observed it is a relatively simple matter to calculate the
successive moments

$$\mu_2 = \frac{\sum x^2 f_x}{\sum f_x}, \qquad \mu_3 = \frac{\sum x^3 f_x}{\sum f_x}, \qquad \mu_4 = \frac{\sum x^4 f_x}{\sum f_x},$$

where the $x$ are measured from the mean.

If the expected frequency with which the value $x$ occurs is taken
from the curve to be fitted, the successive moments can be calculated
in terms of the constants in the equation of the curve. The constants
can then be found by equating the successive moments.

Suppose, for instance, that it is desired to fit the curve $A + Bc^x$
to a set of values of $\mu_x$.
The equations would be

$$\sum \mu_x = \sum A + \sum Bc^x, \qquad (31)$$

$$\sum x\mu_x = \sum Ax + \sum Bxc^x, \qquad (32)$$

$$\sum x^2\mu_x = \sum Ax^2 + \sum Bx^2c^x. \qquad (33)$$


A, B and c can then be found from these equations, which are 
greatly simplified in numerical work if the origin is taken at the 
centre. 

In actuarial work the method of moments is seldom employed 
in this form; instead we equate successive summations. 

As will be seen from the following, the two methods are equivalent
if the argument $x$ is a linear function such as the age.

Suppose that we have a series of values

$u_1, u_2, u_3, \ldots, u_n$, where $\sum_{1}^{n} u_r = N$.

By summing col. (1) of Table XXXI continuously from the end
we obtain the terms

$(u_1 + u_2 + \ldots + u_n),\ (u_2 + u_3 + \ldots + u_n),\ \ldots,\ (u_{n-1} + u_n),\ u_n$

shown in column (2).
Table XXXI

 Function   First summation             Second summation                            Third summation
   (1)            (2)                         (3)                                        (4)

   u_1      u_1 + u_2 + ... + u_n       u_1 + 2u_2 + ... + n u_n                    u_1 + (2.3/2) u_2 + (3.4/2) u_3 + ... + [n(n+1)/2] u_n

   u_2      u_2 + u_3 + ... + u_n       u_2 + 2u_3 + ... + (n-1) u_n                u_2 + (2.3/2) u_3 + ... + [(n-1)n/2] u_n

   ...

   u_r      u_r + ... + u_n             u_r + 2u_{r+1} + ... + (n-r+1) u_n          u_r + (2.3/2) u_{r+1} + ... + [(n-r+1)(n-r+2)/2] u_n

   ...

   u_n      u_n                         u_n                                         u_n

 Total:
   u_1 + u_2 + u_3 + ... + u_n
            u_1 + 2u_2 + ... + n u_n
                                        u_1 + (2.3/2) u_2 + ... + [r(r+1)/2] u_r + ... + [n(n+1)/2] u_n
                                                                                    u_1 + 4u_2 + ... + [n(n+1)(n+2)/6] u_n


The total of these is $nu_n + (n-1)u_{n-1} + (n-2)u_{n-2} + \ldots + u_1$ and
$= Nm_1$, where $m_1$ is the first moment about the origin.

Summing again continuously from the end we obtain

$(u_1 + 2u_2 + 3u_3 + \ldots + nu_n),\ (u_2 + 2u_3 + \ldots + (n-1)u_n),\ \ldots,\ (u_{n-1} + 2u_n),\ u_n$

shown in column (3).

The total of these is

$$\sum_{1}^{n} \frac{r(r+1)}{2}\,u_r = \frac{1}{2}\left(\sum_{1}^{n} r^2 u_r + \sum_{1}^{n} r u_r\right) = \frac{N}{2}\,(m_2 + m_1),$$

where $m_2$ is the second moment about the origin.

Similarly, it is possible to express the rth summation in terms of
the first r moments of the distribution.
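
The two relations just stated are easily checked numerically. The sketch below sums an arbitrary series continuously from the end, twice, and compares the totals with the moments about the origin; the series and the function name `sum_from_end` are illustrative assumptions.

```python
def sum_from_end(values):
    """Continuous summation from the end: entry r is values[r] + ... + values[-1]."""
    running = 0
    out = []
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]


# An arbitrary illustrative series u_1, ..., u_n.
u = [3, 1, 4, 1, 5, 9, 2, 6]
N = sum(u)

first = sum_from_end(u)        # column (2) of Table XXXI
second = sum_from_end(first)   # column (3) of Table XXXI

# Moments about the origin, with the argument r running from 1 to n.
m1 = sum(r * ur for r, ur in enumerate(u, start=1)) / N
m2 = sum(r * r * ur for r, ur in enumerate(u, start=1)) / N

print(sum(first), N * m1)                # total of column (2) = N * m1
print(sum(second), N * (m2 + m1) / 2)    # total of column (3) = N(m2 + m1)/2
```

Because each total is a fixed linear combination of the moments, making the successive summations of two series equal is the same as making their successive moments equal.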

Since

$$\sum_{1}^{r} \frac{r(r+1)\cdots(r+t)}{(t+1)!} = \sum_{1}^{r} \binom{r+t}{t+1} = \binom{r+t+1}{t+2},$$

the way in which each summation can be written down from the
previous one is obvious.

For convenience we have assumed the summations to be made 


from the end, but it is clearly immaterial whether we do this or, as is 
more usual, sum from the beginning. 

By equating the successive summations of the observed values 
and the expected values according to the curve, we are ensuring 
that the successive moments about the mean (or any convenient 
origin) are also equal. 

The most common example is that frequently met with earlier
in this chapter, where the successive summations of actual and
expected deaths are made equal. It is sometimes referred to as
“the method of moments applied by way of successive summations
of the actual and expected deaths”.

The method of moments and the method of least squares can be 
shown to give identical results in many cases and on most problems 
likely to arise in practice the methods would produce similar values 
for the constants. 


28. Advantages of curve-fitting. 

The greatest advantage is that the results are ideally smooth, a 
very important point in rates of mortality to be used for valuation 
purposes and the construction of premium rates. 

The use of some mathematical curves, notably Makeham’s and 
closely allied curves, enables the calculation of complicated func¬ 
tions to be simplified greatly. 

The method often throws considerable light on the way mortality 
is changing from one generation to another and may help in the 
search for a law of mortality. 

In finding the constants of the curve it is usually an easy matter
to allow for the weight of the exposed to risk at each age or
age-group. Thus, although only rough weights are used in finding the
Makeham constant c, the exposed to risk and deaths can be used
in finding the other constants A and B.

Usually the total deviation (actual deaths — expected deaths) and 
the accumulated deviation are automatically made zero or very 
small, so that in the majority of instances a satisfactory fit should 
be attained. 

29. Disadvantages of curve-fitting. 

It is very difficult, particularly with modern data and the
heterogeneity involved, to find a suitable curve. Once this has been done
subsequent work is largely routine and automatic. In practice, 
many attempts usually have to be made and quick results can be 
produced only if one of the earliest proves to be successful. It 
was partly for this reason, the desire for quick results, that the 
A1924-29 data were graduated by a summation formula. 

It is doubtful whether a single curve can ever be fitted successfully 
to heterogeneous data. 

30. Conclusion. 

The whole subject affords a vast field for research and many
functions, such as the number of deaths in the crude mortality
table, have only recently been the subject of experiment. For many
years, owing to the usefulness of Makeham’s formula, attention was
focused almost exclusively on $\mu_x$ and the allied function $\operatorname{colog} p_x$,
but it is possible that other expressions may be used successfully
in the future.
EXAMPLES 9

1. Graduate the following data by means of Makeham’s first formula:
$\mu_x = A + Bc^x$.

 Age-group    Exposed to risk      Actual deaths
              (in central form)

   20-25          15,750                30
   25-30          15,100                40
   30-35          12,750                50
   35-40          14,700                60
   40-45          11,750                70
   45-50          11,750                80
   50-55          13,900                90

The data relate to the active service of employees of a firm which has 
a fixed entry age of 20 and a fixed retiring age of 55. 


2. Tables of graduated values of have been arrived at by several
different methods, all of which produce results which are satisfactory
from the point of view of smoothness. It is stated that the most
satisfactory graduation is that in which the sum of the squares of the differences
between the graduated and ungraduated values is a minimum. Show
clearly on what assumptions this statement is based and how the test is
derived from these assumptions.

Discuss the applicability of the test to mortality statistics generally. 


3. Blend the two following series of $F_y$ between the values of y = 7 and
y = 13 by means of (a) an interpolation formula of the third degree and
(b) the curve of sines:

   y    F^A_y   F^B_y        y    F^A_y   F^B_y

   6     167      -         10     311     312
   7     200      -         11     325     364
   8     235     220        12     395     420
   9     272     264        13      -      480
                            14      -      544


Why is the usual blending method not suitable in such a case ? 
4. The following table shows certain values of $q_x$ extracted from
English Life Table No. 10 (Males) and the Government Annuitants’
Tables 1900-20 (Males):

  Age   E.L. No. 10    G.A.        Age   E.L. No. 10    G.A.

  39      .00531      .00635       46      .00861      .00832
  40      .00562      .00658       47      .00925      .00869
  41      .00598      .00683       48      .00990      .00910
  42      .00639      .00710       49      .01057      .00957
  43      .00687      .00738       50      .01128      .01010
  44      .00741      .00768       51      .01206      .01070
  45      .00799      .00799

Blend the two series to obtain values of $q_x$ passing from the E.L. No. 10
values at ages 42 and under to the G.A. values at ages 48 and over.

Criticize the junction effected and state reasons for any unsatisfactory 
feature. Employ an alternative algebraic process to produce a more 
suitable series fulfilling the same conditions. 

5. In a mortality investigation the data are presented in the following 
form: 


Age-group 

Exposed to risk 

Deaths 

20-26 

... 


27-33 

... 


34-40 

... 


90-96 

... 



The exposed to risk {E^ and deaths have each been obtained at 
each individual age x in such a form that q^, = BJE^^ and have then been 
summed for the age-groups shown. It is desired to obtain graduated 
values of fig, by use of the formula fi^^AcP^ + Bc^, 

State how you would proceed and what difference should in theory be 
made in the procedure if it were desired to obtain graduated values of 
nig. by use of the formula nig. = Ac^ 4- Bc^. 
6. You are required to graduate a mortality experience by the formula
The following is part of the experience:

  Age last birthday    Exposed to risk    Deaths

       11-15                  460             7
       16-20                  994            11
       21-25                2,312            32
       26-30                4,824            54
       31-35                7,684           117
       86-90               19,000             4
    91 and over               Nil            Nil


How would you proceed with the graduation ? 

The part of the table given should be used for examples of the first 
steps; no arithmetical work on the equations formed is required but only 
a description of the methods used to find the constants. 


7. Obtain graduated withdrawal rates from the following data:

(a) by a graphic process,

(b) by fitting the formula: withdrawal rate at duration $t = a + bt$,

and compare the results.

  Duration   Exposed    Withdrawals      Duration   Exposed    Withdrawals
             to risk                                to risk

     0        1200         120              5                      27
     1        1000          70              6         500          20
     2         900          45              7         400          12
     3         800          48              8         350           7
     4         700          30              9         300           7










CHAPTER X 


PIVOTAL VALUES AND OSCULATORY 
INTERPOLATION 

1. Perhaps the method of least general application is that of
pivotal values and osculatory interpolation, used for the English
Life Tables Nos. 7 to 10. This was devised by George King for
the first two of these tables and has been used with slight
modifications ever since. A good deal of criticism has been directed to
the method, which has been described by one leading statistician
as “the highest of high-class cookery”. It should, however, be
remembered that it was produced to deal with very special problems
and these must be borne in mind in assessing its merits or failings.

Chief of these special problems were the following: 

(a) The data were the population of England and Wales (males
and females separately) and the deaths of the years 1901-10 for
E.L. No. 7 and of the three years 1910-12 for E.L. No. 8.
Consequently very large numbers were involved and apart from local
disturbances the progression was already fairly smooth before
graduation commenced.

(b) Tables were required for spinsters, married women, and
widows, as well as for all females combined.

(c) King was required to produce not only rates of mortality for
the construction of a life table, but also a “Graduation of Ages”,
as it was called, i.e. a tabulation of the population age by age with
inaccuracies removed as far as possible.

(d) The chief problem to be solved was how to eliminate “local”
mis-statements of age due to a preference for even numbers and
numbers ending in 5.

(e) Medical Officers of Health had been in the habit of comparing
local mortality with that of other districts or of the country as a
whole by means of the crude death rates, i.e. the number of deaths
per thousand, irrespective of the age distribution. It was therefore
suggested that some simple and easily applied method should be
devised for their use in constructing mortality tables for districts.
The problem is admirably summed up in King’s own words:

“In constructing the tables it was desirable that a method should
be employed, simple in theory, easy in application, and which
would produce curves of smooth graduation, and curves which
would adhere closely to the original data”; and

“The table is not intended to forecast the future, but merely to
give accurately the present populations.”

Even in 1911 the data exhibited irregularities due to such factors
as changing rates of birth, migration and mortality. Such
irregularities were inherent in the data and could not be removed in
arriving at an accurate picture of the numbers at each age, as they
would have been by a powerful method such as was used for the
N.H.I. table. It should be remembered that for the latter table
the object in view was a table of smooth rates suitable for the
calculation of reserve values.

The separate tables for spinsters, widows, and married women 
also had a restrictive effect on the choice of method, because, as 
King says: “Under my instructions it was also necessary that the 
graduation should reproduce the total population exactly, and it 
was evidently also desirable that the method should be such that 
when applied separately to each of the sections of the population 
which make up the whole, the sum of the populations of the several 
sections at each year of age should be identical with the corresponding 
total population.” 

A summation formula would have achieved this last result but 
would not have dealt satisfactorily with the minor mis-statements 
of age which have always been the distinctive feature of census data. 

The most effective way of meeting this difficulty was to group
the data quinquennially. Experiments were carried out at each
census since 1911 to determine which grouping was most
successful. For instance, 29-33, 34-38, etc., were adopted on one occasion
and 33-37, 38-42, etc., on another.

In Chapter I we proved King’s formula

$$u_0 = \cdot2\,w_0 - \cdot008\,\Delta^2 w_{-1},$$

giving the central term of a quinquennial group in terms of the
totals for that group and the two neighbouring groups.
By means of this formula King derived what he called “graduated
quinquennial pivotal values” of $E_x$, the exposed to risk at age x last
birthday, and $\theta_x$, the deaths at age x last birthday in a calendar year.
It should perhaps be emphasized that the graduation was produced

(a) by grouping the data quinquennially, and

(b) by deducing pivotal values on the assumption that fourth and
higher differences of the grouped values were zero.

Of these (a) was of course by far the more important.

2. Osculatory interpolation.

The pivotal values were not subsequently altered and the
intermediate values were inserted by a process of osculatory interpolation.
Any of the well-known formulae such as Everett’s might have been
used, but the objection to these was that for each new interval the
quinquennial values involved were different from those used in the
previous interval, so that although the curve of values was
continuous the gradient was discontinuous on passing from one
interval to the next. To overcome this, King used a formula
specially designed to make the gradient continuous. His method of
deriving this formula, although not the simplest, is instructive, and
as it usually causes difficulty the method is dealt with in some
detail in the next paragraph.

3. King’s formula for osculatory interpolation.

King decided to use a third degree curve for the purpose of
interpolation over each range of six values consisting of two
consecutive pivotal values and four interpolated values. As he had four
available constants he made this curve not only pass through the
pivotal values at the ends of the range but also touch certain lines at
those points.

For instance, if the points N, O, P, Q in Fig. 9 represented pivotal
values of $u_0$, $u_1$, $u_2$ and $u_3$, the values $u_{1\cdot2}$, $u_{1\cdot4}$, $u_{1\cdot6}$ and $u_{1\cdot8}$ were
inserted between O and P by means of a third degree curve which
passed through O and P and touched certain lines $T_1OT_1$ and
$T_2PT_2$.

Similarly, the third degree curve used for the interval PQ passed
through P and Q and touched $T_2PT_2$ and a similar line at Q.

Since both these curves touched the same line $T_2PT_2$ they touched
one another and the gradient was therefore continuous at P.

The method by which the lines $T_1OT_1$ and $T_2PT_2$ were found
is apt to cause the student some difficulty. The lines were fixed
as the tangents to second degree curves which were used for this
purpose only and not for the actual interpolation.

We shall examine the problems in detail.

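The second degree curve passing through N, O and P is clearly

$$u_x = u_0 + x\,\Delta u_0 + \frac{x(x-1)}{2}\,\Delta^2 u_0,$$

so that

$$\frac{du_x}{dx} = \Delta u_0 + \frac{2x-1}{2}\,\Delta^2 u_0.$$

The slope of the tangent $T_1OT_1$ is found by putting x = 1; this
gives

$$\Delta u_0 + \tfrac{1}{2}\Delta^2 u_0. \qquad (1)$$

Similarly, the second degree curve passing through O, P and Q is

$$u_{1+x} = u_1 + x\,\Delta u_1 + \frac{x(x-1)}{2}\,\Delta^2 u_1,$$

and the slope of the tangent $T_2PT_2$ is therefore

$$\Delta u_1 + \tfrac{1}{2}\Delta^2 u_1 = \Delta u_0 + \tfrac{3}{2}\Delta^2 u_0 + \tfrac{1}{2}\Delta^3 u_0. \qquad (2)$$

Having obtained these tangents we dispense with the second
degree curves. We now have to find a third degree curve passing
through O and P and having the above gradients at those points.
This curve is to be used for the actual interpolation.

Let its equation be

$$u_{1+x} = u_1 + bx + cx^2 + dx^3.$$

Since the curve passes through P,

$$u_2 = u_1 + b + c + d,$$

i.e.

$$b + c + d = u_2 - u_1 = \Delta u_1 = \Delta u_0 + \Delta^2 u_0. \qquad (3)$$

Since the curve also touches $T_1OT_1$ and $T_2PT_2$,

$$\left(\frac{du_{1+x}}{dx}\right)_{x=0} = b = \Delta u_0 + \tfrac{1}{2}\Delta^2 u_0, \text{ from (1)} \qquad (4)$$

$$\left(\frac{du_{1+x}}{dx}\right)_{x=1} = b + 2c + 3d = \Delta u_0 + \tfrac{3}{2}\Delta^2 u_0 + \tfrac{1}{2}\Delta^3 u_0, \text{ from (2).} \qquad (5)$$

Solving these equations we obtain King’s osculatory
interpolation formula:

$$u_{1+x} = u_1 + x\,\Delta u_0 + \frac{x(x+1)}{2}\,\Delta^2 u_0 + \frac{x^2(x-1)}{2}\,\Delta^3 u_0. \qquad (6)$$

Note: the formula gives $u_{1+x}$ (not $u_x$) in terms of $u_1$, the value at the
beginning of the interval, and the differences of $u_0$.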
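
The solution of equations (3), (4) and (5) is not written out in the text. One way of carrying it through, given here as a sketch of the intermediate steps rather than as King's own working, is as follows.

```latex
% Subtract (4) from (3):
c + d = \tfrac{1}{2}\Delta^2 u_0 .
% Subtract (4) from (5):
2c + 3d = \Delta^2 u_0 + \tfrac{1}{2}\Delta^3 u_0 .
% Eliminate c between these two results:
d = \tfrac{1}{2}\Delta^3 u_0 , \qquad
c = \tfrac{1}{2}\Delta^2 u_0 - \tfrac{1}{2}\Delta^3 u_0 .
% Substitute b, c and d in u_{1+x} = u_1 + bx + cx^2 + dx^3 and collect terms:
u_{1+x} = u_1 + x\,\Delta u_0
        + \frac{x + x^2}{2}\,\Delta^2 u_0
        + \frac{x^3 - x^2}{2}\,\Delta^3 u_0
        = u_1 + x\,\Delta u_0
        + \frac{x(x+1)}{2}\,\Delta^2 u_0
        + \frac{x^2(x-1)}{2}\,\Delta^3 u_0 .
```

Putting $x = 1$ in the result gives $u_1 + \Delta u_0 + \Delta^2 u_0 = u_2$, which confirms that the curve passes through P.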


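4. Osculatory form of Everett’s formula.

Everett’s well-known formula can be readily modified so as to
produce osculatory interpolation and the student is referred to
Mathematics for Actuarial Students, Part II, pp. 147 et seq., for
details.

Formula (6) may be written in the form

$$u_{1+x} = u_1 + x\,(u_2 - u_1 - \Delta^2 u_0) + \frac{x(x+1)}{2}\,\Delta^2 u_0 + \frac{x^2(x-1)}{2}\,(\Delta^2 u_1 - \Delta^2 u_0)$$

$$= x u_2 + (1-x)\,u_1 + \frac{x^2(x-1)}{2}\,\Delta^2 u_1 - \frac{x(1-x)^2}{2}\,\Delta^2 u_0$$

$$= x u_2 + \frac{x^2(x-1)}{2}\,\Delta^2 u_1 + \xi u_1 + \frac{\xi^2(\xi-1)}{2}\,\Delta^2 u_0, \qquad (7)$$

where $\xi = 1 - x$.

This differs from the usual formula in having coefficients
$\dfrac{x^2(x-1)}{2}$, $\dfrac{\xi^2(\xi-1)}{2}$ instead of $\dfrac{x(x^2-1)}{3!}$, $\dfrac{\xi(\xi^2-1)}{3!}$.

An example of the use of this formula will be given later, but
King’s formula can be applied as follows to give the interpolated
values $u_{1\cdot2}$, $u_{1\cdot4}$, $\ldots$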

Differencing formula (6) at intervals of 1/t we obtain the following
results, where $\Delta$ as before relates to differences of grouped values
and $\delta$ relates to differences at intervals of 1/t.

$$\delta u_1 = \frac{\Delta u_0}{t} + \frac{t+1}{2t^2}\,\Delta^2 u_0 - \frac{t-1}{2t^3}\,\Delta^3 u_0,$$

$$\delta^2 u_1 = \frac{\Delta^2 u_0}{t^2} - \frac{t-3}{t^3}\,\Delta^3 u_0,$$

$$\delta^3 u_1 = \frac{3}{t^3}\,\Delta^3 u_0.$$

When t = 5, these become

$$\delta u_1 = \cdot2\,\Delta u_0 + \cdot12\,\Delta^2 u_0 - \cdot016\,\Delta^3 u_0,$$
$$\delta^2 u_1 = \cdot04\,\Delta^2 u_0 - \cdot016\,\Delta^3 u_0,$$
$$\delta^3 u_1 = \cdot024\,\Delta^3 u_0.$$

The four interpolated values can then readily be fitted in, but
in order to check the work by reproducing $u_2$ it is necessary to
retain three extra decimal places in the $\delta$’s. Moreover, these
differences have to be re-calculated for every interval, so that the
osculatory form of Everett’s formula is usually quicker, particularly if
many values have to be inserted.


Example 1.

Given the following data, calculate the quinquennial pivotal value of
$q_{39}$ and interpolate the values $q_{35}$ to $q_{43}$ by an osculatory formula.

  Age    Population      Deaths        Age    Population      Deaths
         at 30th June    in years             at 30th June    in years
            1935         1934-36                 1935         1934-36
  (1)       (2)            (3)         (4)       (5)            (6)

  30       2270            27          40       2346            35
  31       2049            26          41       2048            35
  32       2198            29          42       2186            38
  33       2192            28          43       2073            35
  34       2203            29          44       2009            37
  35       2226            30          45       2028            38
  36       2252            30          46       1912            39
  37       2176            32          47       1858            40
  38       2324            35          48       1906            41
  39       2241            34          49       1762            43

  Age x    Quinquennial pivotal
           values of $q_x$

   24         .00365
   29         .00387
   34         .00439
   44         .00605
   49         .00810
   54         .01167


To obtain the pivotal value of $q_{39}$ it is first necessary to group the data
into the ranges 32-36, 37-41, etc. King’s formula, $u_0 = \cdot2\,w_0 - \cdot008\,\Delta^2 w_{-1}$,
then gives the graduated pivotal values of the population and deaths at
age 39 as follows:

  Central   Group                                Group
  age       population   Δ (2)    Δ² (2)         deaths    Δ (5)    Δ² (5)
  (1)       (2)          (3)      (4)            (5)       (6)      (7)

  34        11,071         64                     146        25
  39        11,135       -927     -991            171        16       -9
  44        10,208                                187

The adjusted population at age 39 is therefore
.2 (11,135) - .008 (-991) = 2235
and the adjusted deaths are
.2 (171) - .008 (-9) = 34.27 for the three years or 11.42 per annum.

Hence $m_{39} = \dfrac{11\cdot42}{2235}$ and $q_{39} = \dfrac{11\cdot42}{2235 + 5\cdot72} = \cdot00510$.
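
The pivotal calculation can be reproduced from the data of the example. In the sketch below the function names `group_total` and `king_pivotal` are illustrative, and $q_{39}$ is obtained from the annual deaths and the adjusted population by adding half the deaths to the denominator, as in the working above.

```python
# Data of Example 1, ages 32-46: population at 30th June 1935 and deaths 1934-36.
population = {32: 2198, 33: 2192, 34: 2203, 35: 2226, 36: 2252,
              37: 2176, 38: 2324, 39: 2241, 40: 2346, 41: 2048,
              42: 2186, 43: 2073, 44: 2009, 45: 2028, 46: 1912}
deaths = {32: 29, 33: 28, 34: 29, 35: 30, 36: 30,
          37: 32, 38: 35, 39: 34, 40: 35, 41: 35,
          42: 38, 43: 35, 44: 37, 45: 38, 46: 39}


def group_total(data, central_age):
    """Total of the quinquennial group centred on central_age."""
    return sum(data[age] for age in range(central_age - 2, central_age + 3))


def king_pivotal(w_prev, w_mid, w_next):
    """King's formula: 0.2 * w_0 - 0.008 * (second difference of the group totals)."""
    second_difference = w_next - 2 * w_mid + w_prev
    return 0.2 * w_mid - 0.008 * second_difference


pop_39 = king_pivotal(*(group_total(population, c) for c in (34, 39, 44)))
deaths_39 = king_pivotal(*(group_total(deaths, c) for c in (34, 39, 44)))

annual_deaths = deaths_39 / 3                        # three years of deaths
m_39 = annual_deaths / pop_39                        # central rate
q_39 = annual_deaths / (pop_39 + annual_deaths / 2)  # rate of mortality

print(round(pop_39), round(deaths_39, 2), round(q_39, 5))
```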

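To interpolate the required values by the formula

$$u_{1+x} = x u_2 + \frac{x^2(x-1)}{2}\,\Delta^2 u_1 + \xi u_1 + \frac{\xi^2(\xi-1)}{2}\,\Delta^2 u_0, \qquad \xi = 1 - x,$$

we proceed as follows:

  Value of x                               .2       .4       .6       .8

  Value of coefficient $x^2(x-1)/2$      -.016    -.048    -.072    -.064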
As these coefficients are all multiples of .008 it is convenient to tabulate
.008Δ² as follows:

Table XXXII

  Age y    q_y x 10^5    Δ (2)    Δ² (2)    .008Δ² (2)
  (1)      (2)           (3)      (4)       (5)

  24         365           22
  29         387           52       30        .240
  34         439           71       19        .152
  39         510           95       24        .192
  44         605          205      110        .880
  49         810          357      152       1.216
  54        1167

Table XXXIII

  Age    x u_2     [x²(x-1)/2] Δ²u_1    (2)+(3)     ξ terms     Interpolated
                                                                value (4)+(5)
  (1)    (2)       (3)                  (4)         (5)         (6)

  29      -          -
  30      87.8      -.304                87.496
  31     175.6      -.912               174.688
  32     263.4     -1.368               262.032
  33     351.2     -1.216               349.984
  34      -          -                    -           -         (439)
  35     102        -.384               101.616     349.984     451.600
  36     204       -1.152               202.848     262.032     464.880
  37     306       -1.728               304.272     174.688     478.960
  38     408       -1.536               406.464      87.496     493.960
  39      -          -                    -           -         (510)
  40     121       -1.760               119.240     406.464     525.704
  41     242       -5.280               236.720     304.272     540.992
  42     363       -7.920               355.080     202.848     557.928
  43     484       -7.040               476.960     101.616     578.576
  44      -          -                    -           -         (605)

(For illustration the full number of decimal places is retained.)
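
The interpolated values of Table XXXIII can be reproduced by a direct application of the osculatory form of Everett's formula to the pivotal values of $q_y \times 10^5$. The sketch below assumes quinquennial pivots and second differences centred on each pivotal age; the function names are illustrative only.

```python
# Pivotal values of q_y x 10^5 from column (2) of Table XXXII.
pivots = {24: 365, 29: 387, 34: 439, 39: 510, 44: 605, 49: 810, 54: 1167}


def second_difference(age):
    """Central second difference of the pivotal values at a pivotal age."""
    return pivots[age + 5] - 2 * pivots[age] + pivots[age - 5]


def osculatory_everett(age):
    """Interpolate at an age lying between two pivotal ages five years apart."""
    lower = max(a for a in pivots if a <= age)   # pivotal age opening the interval
    upper = lower + 5
    x = (age - lower) / 5
    xi = 1 - x
    x_terms = x * pivots[upper] + x * x * (x - 1) / 2 * second_difference(upper)
    xi_terms = xi * pivots[lower] + xi * xi * (xi - 1) / 2 * second_difference(lower)
    return x_terms + xi_terms


for age in range(35, 44):
    print(age, round(osculatory_everett(age), 3))
```

The two parts of the formula are computed separately because the `x_terms` of one interval, taken in reverse order, are the `xi_terms` of the next, which is how the table is arranged.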
In performing the actual interpolation only the x terms are
calculated, since in reverse order they form the ξ terms for the succeeding
interval. Thus the first four items in column (5) of Table XXXIII are
merely the first four items of column (4) in reverse order. This is similar
to the ordinary use of Everett’s formula: see Example 2 on pp. 73-5 of
Mathematics for Actuarial Students.

For ages 35-38, $u_1 = q_{34} \times 10^5$, $u_2 = q_{39} \times 10^5$.

For ages 40-43, $u_1 = q_{39} \times 10^5$, $u_2 = q_{44} \times 10^5$.

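5. King’s short method of constructing abridged Life Tables.

For the benefit of Medical Officers of Health and others interested
in vital statistics the following method was devised for constructing
an abridged mortality table and values of $\mathring{e}_x$.

First, formulae were required giving the sum of five values for
successive ages in terms of the values at quinquennial intervals.
From the formula

$$u_{1+x} = u_1 + x\,\Delta u_0 + \frac{x(x+1)}{2}\,\Delta^2 u_0 + \frac{x^2(x-1)}{2}\,\Delta^3 u_0$$

the following formulae were deduced:

$$u_1 + u_{1\cdot2} + u_{1\cdot4} + u_{1\cdot6} + u_{1\cdot8} = 5u_0 + 7\Delta u_0 + 1\cdot6\,\Delta^2 u_0 - \cdot2\,\Delta^3 u_0 \qquad (8)$$

and

$$u_{1\cdot2} + u_{1\cdot4} + u_{1\cdot6} + u_{1\cdot8} + u_2 = 5u_0 + 8\Delta u_0 + 2\cdot6\,\Delta^2 u_0 - \cdot2\,\Delta^3 u_0. \qquad (9)$$

For the initial group these formulae fail and were replaced by

$$u_0 + u_{\cdot2} + u_{\cdot4} + u_{\cdot6} + u_{\cdot8} = 5u_0 + 2\Delta u_0 - \cdot4\,\Delta^2 u_0 + \cdot2\,\Delta^3 u_0 \qquad (10)$$

and

$$u_{\cdot2} + u_{\cdot4} + u_{\cdot6} + u_{\cdot8} + u_1 = 5u_0 + 3\Delta u_0 - \cdot4\,\Delta^2 u_0 + \cdot2\,\Delta^3 u_0. \qquad (11)$$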
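
The numerical coefficients in formulae (8) to (11) can be verified by summing the interpolation coefficients in exact fractions. The sketch assumes that (8) and (9) come from King's osculatory formula measured from $u_1$, and, as a working assumption not stated in the text, that (10) and (11) come from the ordinary advancing-difference formula measured from $u_0$; the results agree with the printed coefficients on that assumption.

```python
from fractions import Fraction as F


def king_coefficients(x):
    """Coefficients of the 1st, 2nd and 3rd differences of u_0 in u_(1+x) - u_1."""
    return (x, x * (x + 1) / 2, x * x * (x - 1) / 2)


def newton_coefficients(x):
    """Coefficients of the 1st, 2nd and 3rd differences of u_0 in u_x - u_0."""
    return (x, x * (x - 1) / 2, x * (x - 1) * (x - 2) / 6)


def summed(coefficients, fifths, shift):
    """Sum the coefficients over x = k/5; shift adds 5 * (first difference) when
    the five leading terms are 5*u_1 = 5*u_0 + 5*(first difference of u_0)."""
    total = [F(0), F(0), F(0)]
    for k in fifths:
        for i, c in enumerate(coefficients(F(k, 5))):
            total[i] += c
    total[0] += shift
    return tuple(total)


formula_8 = summed(king_coefficients, range(0, 5), shift=5)
formula_9 = summed(king_coefficients, range(1, 6), shift=5)
formula_10 = summed(newton_coefficients, range(0, 5), shift=0)
formula_11 = summed(newton_coefficients, range(1, 6), shift=0)

for name, value in (("(8)", formula_8), ("(9)", formula_9),
                    ("(10)", formula_10), ("(11)", formula_11)):
    print(name, [float(v) for v in value])
```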

The following gives the subsequent process in detail:

(i) Calculate quinquennial pivotal values of population and
deaths and hence $q_x$ at quinquennial intervals.

(ii) Deduce the values of $\log p_x$ at quinquennial intervals.

(iii) Since $\log {}_5p_x = \log p_x + \log p_{x+1} + \ldots + \log p_{x+4}$, calculate the
values of $\log {}_5p_x$ from the values in (ii).

(Formula (10) has to be used for the first interval.)

(iv) Take a suitable radix for $l_x$ and, using the values in (iii),
find $\log l_x$ at quinquennial intervals. These values will not
extend to the end of the life table and the last values will have
to be inserted by some arbitrary but reasonable method.

(v) Taking antilogs obtain $l_x$ at quinquennial intervals. Formula
(9) (formula (11) for the first interval) then gives

$$l_{x+1} + l_{x+2} + l_{x+3} + l_{x+4} + l_{x+5}$$

in terms of $l_x$ and the differences taken at quinquennial
intervals. (King gave this sum of five consecutive values of $l_x$ at unit
intervals a special symbol.)

(vi) By summing these values from the bottom upwards
obtain $\sum_{t=1}^{\infty} l_{x+t}$ at quinquennial intervals and hence

$$e_x = \frac{1}{l_x}\sum_{t=1}^{\infty} l_{x+t}.$$

The addition of .5 gives the complete expectation $\mathring{e}_x$.

Table XXXIV is an example of the use of the method and is
taken from King’s report on English Life Tables Nos. 7 and 8.

The initial processes were straightforward and are not shown. 
They were as follows. From the data grouped in the ranges 4-8, 
9-13, ... 104-108 quinquennial pivotal values were obtained for 
ages II to loi inclusive and the values of — logp, shown in 
column (2) calculated. 

Formula (8) then enabled logsp, to be deduced as far as age 91, 
but loggPw lugspioi were also needed. These were inserted as 
shown by assuming a constant fourth difference for logp^* Any 
other reasonable assumption would have given much the same result. 

Column (7) was obtained by the application of formulae (8) and 
(10). It should be noted that, throughout the work, values at 
quinquennial intervals only were used, the intervening ages being 
allowed for in the construction of formulae (8) to (ii). 

To derive the values of $\mathring{e}_x$ at quinquennial intervals the work was 
as follows: 

Column (13) was obtained by means of formulae (9) and (11) 
from the values of $l_x$ and the differences. Column (14) was then 
derived by summing from the bottom upwards and the final column 
obtained by dividing the entries in column (14) by the corresponding 
values of $l_x$ and adding 0.5. 
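
The derivation of columns (14) and (15) from column (13) is a cumulative sum taken from the oldest age downwards, followed by a division. A short Python sketch, using the figures of Table XXXIV for ages 11 to 101; the function name `expectations` is illustrative only.

```python
# l_x (column 9) and the five-year sums (column 13) at ages 11, 16, ..., 101.
lx = [100000, 98967, 97474, 95616, 93515, 90920, 87640, 83476, 78134,
      71216, 62290, 51317, 38562, 24958, 12884, 4898, 1212, 208, 22]
five_sum = [497104, 490521, 481918, 471924, 460026, 445074, 426120,
            401905, 370633, 330113, 279297, 218846, 151862, 87444,
            38784, 12036, 2348, 286, 22]

def expectations(lx, five_sum):
    """Return (column 14, column 15) from l_x and the five-year sums."""
    col14 = [0] * len(five_sum)
    running = 0
    # Sum from the bottom of the table upwards.
    for i in range(len(five_sum) - 1, -1, -1):
        running += five_sum[i]
        col14[i] = running
    # Curtate expectation is col14 / l_x; adding 0.5 gives the complete one.
    col15 = [s / l + 0.5 for s, l in zip(col14, lx)]
    return col14, col15

col14, col15 = expectations(lx, five_sum)
for age, s, e in zip(range(11, 106, 5), col14, col15):
    print(age, s, round(e, 2))
```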
Age    $\Delta^4$(2)   $-10^5 \log {}_5p_x$   $\log l_x$      $l_x$
          (5)             (7)                  (8)          (9)

 11                        451               5.00000      100,000
 16                        660               4.99549       98,967
 21                        836               4.98889       97,474
 26                        965               4.98053       95,616
 31                      1,222               4.97088       93,515
 36                      1,596               4.95866       90,920
 41                      2,114               4.94270       87,640
 46                      2,872               4.92156       83,476
 51                      4,026               4.89284       78,134
 56                      5,816               4.85258       71,216
 61                      8,416               4.79442       62,290
 66                     12,410               4.71026       51,317
 71                     18,895               4.58616       38,562
 76                     28,717               4.39721       24,958
 81                     42,001               4.11004       12,884
 86                     60,640               3.69003        4,898
 91      14,116         76,594               3.08363        1,212
 96      14,116         97,165               2.31769          208
101      14,116        203,863               1.34604           22
106                                    $\bar{1}$.30741            0
Table XXXIV 

Age    $l_x$*     $-\Delta$(9)   $-\Delta^2$(9)   $\Delta^3$(9)   $\sum_{t=1}^{5} l_{x+t}$   $\sum_{t=1}^{\infty} l_{x+t}$   $\mathring{e}_x$
(1)     (9)        (10)        (11)         (12)          (13)            (14)        (15)

 11   100,000     1,033                                  497,104       5,166,263     52.16
 16    98,967     1,493        460         +95           490,521       4,669,159     47.68
 21    97,474     1,858        365        +122           481,918       4,178,638     43.37
 26    95,616     2,101        243        -251           471,924       3,696,720     39.16
 31    93,515     2,595        494        -191           460,026       3,224,796     34.98
 36    90,920     3,280        685        -199           445,074       2,764,770     30.91
 41    87,640     4,164        884        -294           426,120       2,319,696     26.97
 46    83,476     5,342      1,178        -398           401,905       1,893,576     23.18
 51    78,134     6,918      1,576        -432           370,633       1,491,671     19.59
 56    71,216     8,926      2,008         -39           330,113       1,121,038     16.24
 61    62,290    10,973      2,047        +265           279,297         790,925     13.20
 66    51,317    12,755      1,782        +933           218,846         511,628     10.47
 71    38,562    13,604        849      +2,379           151,862         292,782      8.09
 76    24,958    12,074     -1,530      +2,558            87,444         140,920      6.15
 81    12,884     7,986     -4,088        +212            38,784          53,476      4.65
 86     4,898     3,686     -4,300      -1,618            12,036          14,692      3.50
 91     1,212     1,004     -2,682      -1,864             2,348           2,656      2.69
 96       208       186       -818        -654               286             308      1.99
101        22        22       -164                            22              22      1.50
106         0                                                  -               -         -

* Column (9) on p. 273. 
6. English Life Table No. 7. 

These tables (one for each sex) were based on the population of 
England and Wales at the censuses of 1901 and 1911 and the 
recorded deaths in the calendar years 1901-10 inclusive. 

The populations of 1901 were supplied for each of the first four 
years of age and then in quinquennial groups 5-9, 10-14, ..., with 
a final group 100 and over. 

The 1911 figures were available for each year and were grouped 
in the same way as the 1901 figures in order that the mean population 
could be found by Waters’s method. Finally the interpolated figures 
for 100 and over were split into those for ages 100-104 and those 
for ages 105 and over in the proportions shown by the 1911 Census 
figures. 

The deaths for the ten years were given for each of the first five 
years, then in quinquennial groups up to 24 last birthday, and then 
in decennial groups 25-34, etc. up to 75-84, with a final group 
85-99. For 1901 to 1909 the deaths of centenarians were given age 
by age and the deaths for 1910 were subdivided into the groups 
100-104 and 105 and over in the same proportion. Similarly, the 
decennial groups were split into quinquennial groups by means of 
the figures for the years 1910-12, which were available for each age 
up to 99. Using the grouping 5-9, 10-14, ... 100-104, graduated 
quinquennial pivotal values were obtained for ages 12, 17, ... 97 
for populations and deaths and the values of $q_x$ were deduced. 

For osculatory interpolation $\log(q_x + 0.1)$ was used instead of 
$q_x$ and values were obtained for ages 17 to 92 inclusive. 

For each of the ages 0 to 4, $q_x$ was derived from the records of 
births and deaths, while $q_x$ for age 12 had already been obtained as 
a pivotal value. Using the values of $q_x$ (not $\log(q_x + 0.1)$) for ages 
3, 4, 12, 17 and 18 the intermediate values were found by Lagrange’s 
formula, although divided differences would have given the same 
result rather more simply. 
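
The remark that Lagrange’s formula and divided differences give the same result can be checked numerically. The sketch below is illustrative only: the ages are those named in the text, but the values of $q_x$ are hypothetical figures chosen for the demonstration, and the function names are not from the original.

```python
def lagrange(xs, ys, x):
    """Lagrange interpolation through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def newton_divided(xs, ys, x):
    """Newton's divided-difference form of the same polynomial."""
    coef = list(ys)
    n = len(xs)
    # Build the divided differences in place.
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - k])
    # Evaluate by nested multiplication.
    result = coef[-1]
    for i in range(n - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

ages = [3, 4, 12, 17, 18]
q = [0.0120, 0.0085, 0.0019, 0.0028, 0.0030]  # hypothetical values of q_x

for age in range(5, 12):
    print(age, round(lagrange(ages, q, age), 5),
          round(newton_divided(ages, q, age), 5))
```

Both functions evaluate the same quartic through the five points, so the two printed columns agree; the divided-difference form needs fewer multiplications when many ages are to be filled in.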

For old ages $\log p_x$ was the function operated on. The values for 
ages 89, 90, 91, 92 and 97 were available and, by assuming a constant 
fourth difference, the remaining values were easily found. 
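
Assuming a constant fourth difference is equivalent to passing a polynomial of the fourth degree through the five available values. A minimal Python sketch, in which the five values of $\log p_x$ are hypothetical and the helper names are illustrative:

```python
def lagrange(xs, ys, x):
    """Value at x of the polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def differences(values, order):
    """Finite differences of the given order at unit intervals."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

ages = [89, 90, 91, 92, 97]
log_p = [-0.110, -0.120, -0.131, -0.143, -0.220]  # hypothetical log10 p_x

# Fill in every age from 89 to 100 from the quartic through the five points.
fitted = {age: lagrange(ages, log_p, age) for age in range(89, 101)}
fourth = differences([fitted[a] for a in range(89, 101)], 4)
print([round(d, 6) for d in fourth])  # all entries are equal
```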
7. English Life Table No. 8. 

The populations at the 1911 Census were available for each age, 
as were the deaths of the years 1910-12 inclusive as far as age 99, 
although centenarians were grouped together. 

The populations were first grouped as for the 1901 Census so 
that they could be brought down to 1st July 1911, the mid-point 
of the three years. 

They were then grouped for ages 4-8, 9-13, etc., so as to deal as 
effectively as possible with local mis-statements of age. The deaths 
over age 100 were split into the necessary ranges 100-103 and 104 
and over by means of the deaths for 1912, which were available 
age by age. 

Having thus obtained the population and deaths for age-groups 
4-8, 9-13, ... 99-103, pivotal values were calculated for ages 6, 11, 
16, ... 96 and $q_x$ obtained for these ages. Osculatory interpolation 
produced the values from age 16 to 91 inclusive and $\log p_x$ was 
calculated for ages 88, 89, 90, 91 and 96 (the last pivotal value). 
Assuming a constant fourth difference as before the table was 
completed at the high ages. 

$q_x$ was calculated from statistics of births and deaths for each of 
the first six years of life and, by using the values of $q_x$ for ages 4, 5, 
11, 16 and 17, the remaining values were inserted by Lagrange’s 
formula. 

8. English Life Table No. 9. 

The population enumerated at the 1921 Census and the deaths 
of the years 1920-22 were available for each age and, as births and 
deaths were available for each quarter of the calendar years 
involved, the rates of mortality for infantile ages were calculated 
more accurately than was possible before. The quinquennial 
groupings adopted were 2-6, 7-11, ... 92-96. Pivotal values of 
populations and deaths were obtained and osculatory interpolation 
produced rates for ages 14 to 84 inclusive. 

As an alternative the crude values of $q_x$ were calculated from the 
data; quinquennial groups 5-9, 10-14, ... were adopted to reduce 
irregularities. The same process as above was then applied to the 
grouped values of $q_x$ and the rates obtained were very similar to 
those produced by operating on populations and deaths separately. 
For the sake of continuity these latter rates were adopted. 

The rates for ages 0 to 5 were calculated from the statistics of 
births and deaths. The values for ages 85 and over were obtained 
by a Gompertz graduation, in which the values of $\log p_x$ are in 
geometric progression. The common ratio $r$ was used to construct 
the successive values $\log_{10} p_{85}$, $\log_{10} p_{86}$, etc. from $\log_{10} p_{84}$, 
and the values of $q_x$ were deduced. 

For ages 6 to 13 it was assumed that 

$$q_x = a + bx + \tfrac{1}{2}cx(x-1) + \tfrac{1}{6}dx(x-1)(x-2),$$

and the constants were found from four values of $q_x$ 
already obtained. 

9. English Life Table No. 10. 

After various trials it was found that the groupings 5-9, 10-14, ... 
were as good as any, and rates of mortality were found in the usual 
way for ages 17 to 87 inclusive. As for E.L. No. 9 a Gompertz 
curve was used to extend the table to the end of life. In this method 
the ratio $\operatorname{colog} p_{x+5} / \operatorname{colog} p_x$ was taken as constant for each 
sex; the value for females was 1.42. 
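
The Gompertz extension can be sketched in a few lines of Python. The five-year ratio 1.42 is the value quoted for females; the starting value of $\operatorname{colog} p_{87}$ is hypothetical and the function name `extend_gompertz` is illustrative only.

```python
RATIO_5 = 1.42             # colog p_{x+5} / colog p_x, as quoted for females
c = RATIO_5 ** (1 / 5)     # implied ratio between successive ages

def extend_gompertz(colog_start, years):
    """colog10 p_x for the starting age and the following `years` ages."""
    return [colog_start * c ** t for t in range(years + 1)]

colog_p = extend_gompertz(0.080, 15)        # hypothetical colog10 p_87
q = [1 - 10 ** (-v) for v in colog_p]       # q_x = 1 - p_x

for age, rate in zip(range(87, 103), q):
    print(age, round(rate, 5))
```

Because each value of $\operatorname{colog} p_x$ is a fixed multiple of the one before, the rates rise steadily towards unity and the table can be carried to any age required.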

The rates for ages 0 to 5 were calculated from records of births 
and deaths. Special attention was paid to $q_0$, for which more detailed 
data were available. 

For ages 6 to 16 special methods had to be used because of the 
rapid changes in the birth-rates after the war of 1914-18. As a result 
the values for ages 17 to 22 already obtained had to be modified to 
produce a smooth progression. The whole question was however 
one of construction rather than graduation. 

10. Advantages and disadvantages of the method. 

It is hardly possible to discuss on general lines the merits and 
demerits of the method of graduated quinquennial pivotal values 
and osculatory interpolation. The method was devised to meet a 
special problem and has proved so successful that it has been 
modified only in detail. 
The weak graduating power of the formula renders it unsuitable 
for many purposes, particularly if a high degree of smoothness is 
essential. In assessing any method of graduation however it is 
only fair to take into account the type of data with which it was 
intended to deal. 
EXAMPLES 10 

1. From the following data find values of the rate of mortality for 
ages 42 to 67 by means of graduated quinquennial pivotal values and 
osculatory interpolation: 

Period 1st Jan. 1935 to 31st Dec. 1937 

Age-group last     Mean population     Deaths
birthday

30-34                  36,466             294
35-39                  48,634             474
40-44                  55,100             783
45-49                  56,623            1098
50-54                  49,684            1440
55-59                  37,664            1749
60-64                  24,139            1839
65-69                  16,511            2043
70-74                  11,881            2445


2. Without making use of the interpolated rates found in the question 
above find $a_{47:\overline{15}|}$ at 3½ per cent interest from the data of that question 
and compare it with the value based on the interpolated rates. 
3. From the undernoted data relating to the period 1934-36 calculate 
an approximate value of using King’s short method. 


Age-group     Mean population     Deaths

50-53             68,400            2400
53-56             64,200            2940
56-59             59,600            3360
59-62             51,200            3750
62-65             44,700            4350
65-68             37,000            4650
68-71             30,100            4980
71-74             22,200            5100
74-77                               4800
77-80                               4080
Table I. Values of the ordinates and the distribution 
function for the Normal Curve. 

The ordinate $y$ or $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$. 

The area to the left of the ordinate at the point $x$ (the distribution 
function) is 

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}t^2}\,dt.$$

The values are tabulated for positive values of $x$ only; a change in 
the sign of $x$ does not affect $y$ while $F(-x) = 1 - F(x)$. 

 x     y = φ(x)     F(x)         x     y = φ(x)     F(x)

0.0    0.39894    0.50000       2.5    0.01753    0.99379
0.1    0.39695    0.53983       2.6    0.01358    0.99534
0.2    0.39104    0.57926       2.7    0.01042    0.99653
0.3    0.38139    0.61791       2.8    0.00792    0.99744
0.4    0.36827    0.65542       2.9    0.00595    0.99813
0.5    0.35207    0.69146       3.0    0.00443    0.99865
0.6    0.33322    0.72575       3.1    0.00327    0.99903
0.7    0.31225    0.75804       3.2    0.00238    0.99931
0.8    0.28969    0.78814       3.3    0.00172    0.99952
0.9    0.26609    0.81594       3.4    0.00123    0.99966
1.0    0.24197    0.84134       3.5    0.00087    0.99977
1.1    0.21785    0.86433       3.6    0.00061    0.99984
1.2    0.19419    0.88493       3.7    0.00042    0.99989
1.3    0.17137    0.90320       3.8    0.00029    0.99993
1.4    0.14973    0.91924       3.9    0.00020    0.99995
1.5    0.12952    0.93319       4.0    0.00013    0.99997
1.6    0.11092    0.94520       4.1    0.00009    0.99998
1.7    0.09405    0.95543       4.2    0.00006    0.99999
1.8    0.07895    0.96407       4.3    0.00004    0.99999
1.9    0.06562    0.97128       4.4    0.00002    0.99999
2.0    0.05399    0.97725       4.5    0.00002
2.1    0.04398    0.98214       4.6    0.00001
2.2    0.03547    0.98610       4.7    0.00001
2.3    0.02833    0.98928       4.8    0.00000
2.4    0.02239    0.99180

For intermediate values second difference interpolation is usually 
sufficient although on occasions third differences should be allowed 
for if great accuracy is required. 
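
As an illustration of second difference interpolation, the following Python sketch estimates $F(1.05)$ from the tabulated values at 1.0, 1.1 and 1.2 and compares it with a direct evaluation. The direct evaluation uses the error function from the standard library, which is a modern convenience and not part of the original table; the function names are illustrative.

```python
import math

def phi(x):
    """Ordinate of the normal curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def F(x):
    """Distribution function of the normal curve."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def interpolate2(f0, f1, f2, p):
    """Newton's forward formula as far as second differences.

    f0, f1, f2 are consecutive tabulated values and p is the fraction
    of the tabular interval beyond the first of them.
    """
    d1 = f1 - f0
    d2 = f2 - 2 * f1 + f0
    return f0 + p * d1 + p * (p - 1) / 2 * d2

approx = interpolate2(0.84134, 0.86433, 0.88493, 0.5)
exact = F(1.05)
print(round(approx, 5), round(exact, 5))
```

The two results differ only in the fifth decimal place, which bears out the statement that second differences are usually sufficient.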
From the above table the following values may be obtained: 

F(x)        x          F(x)       x

.001     -3.090        .999     3.090
.005     -2.576        .995     2.576
.010     -2.326        .990     2.326
.025     -1.960        .975     1.960
.050     -1.645        .950     1.645
.25      -0.674        .75      0.674
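
These values can be checked by inverting the distribution function numerically. A brief Python sketch using `statistics.NormalDist` from the standard library (Python 3.8 or later), offered as a modern check rather than as the method by which the values were originally obtained:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probabilities from the table above, each paired with its inverse value.
probabilities = [0.001, 0.005, 0.010, 0.025, 0.050, 0.25,
                 0.75, 0.950, 0.975, 0.990, 0.995, 0.999]
points = {p: round(standard_normal.inv_cdf(p), 3) for p in probabilities}

for p, x in points.items():
    print(p, x)
```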




Table II. Table of values of $\chi^2$ corresponding to critical values of $P$ 

Judgement                        P                     Degrees of freedom
                                          1        5       10       15       20       25       30

"Much too probable"            .999       -        -        -        -        -        -        -
"Too probable"                 .99      .0002     .55     2.56     5.23     8.26    11.52    14.95
"Rather too probable"          .95      .004     1.14     3.94     7.26    10.85    14.61    18.49
                               .5       .455     4.35     9.34    14.34    19.34    24.34    29.34
"Of doubtful improbability"    .05      3.84    11.07    18.31    25.00    31.41    37.65    43.77
"Improbable"                   .01      6.64    15.09    23.21    30.58    37.57    44.31    50.89
"Very improbable"              .001    10.83    20.52    29.59    37.70    45.32    52.62    59.70

                                 P       35       40       45       50       55       60       65

"Much too probable"            .999    13.61    16.81    20.12    23.53    27.01    30.56    34.18
"Too probable"                 .99     17.88    21.53    25.26    29.06    32.92    36.83    40.78
"Rather too probable"          .95     22.19    26.23    30.34    34.49    38.68    42.91    47.17
                               .5      34.50    39.50    44.50    49.50    54.50    59.50    64.50
"Of doubtful improbability"    .05     49.52    55.47    61.37    67.22    73.03    78.80    84.54
"Improbable"                   .01     56.53    62.88    69.15    75.35    81.49    87.58    93.63
"Very improbable"              .001    64.94    71.74    78.43    85.02    91.54    97.98   104.37

                                 P       70       75       80       85       90       95      100

"Much too probable"            .999    37.84    41.55    45.31    49.10    52.93    56.79    60.68
"Too probable"                 .99     44.78    48.81    52.87    56.96    61.08    65.22    69.39
"Rather too probable"          .95     51.46    55.77    60.11    64.47    68.85    73.24    77.65
                               .5      69.50    74.50    79.50    84.50    89.50    94.50    99.50
"Of doubtful improbability"    .05     90.25    95.93   101.59   107.24   112.86   118.47   124.06
"Improbable"                   .01     99.63   105.60   111.54   117.45   123.33   129.19   135.02
"Very improbable"              .001   110.71   117.00   123.24   129.45   135.62   141.76   147.87

                                 P      105      110      115      120      125      130      135

"Much too probable"            .999    64.60    68.54    72.51    76.50    80.51    84.54    88.59
"Too probable"                 .99     73.57    77.78    82.00    86.24    90.50    94.77    99.05
"Rather too probable"          .95     82.07    86.51    90.96    95.42    99.90   104.38   108.87
                               .5     104.50   109.50   114.50   119.50   124.50   129.50   134.50
"Of doubtful improbability"    .05    129.63   135.19   140.74   146.28   151.81   157.33   162.83
"Improbable"                   .01    140.84   146.63   152.41   158.17   163.91   169.64   175.36
"Very improbable"              .001   153.95   160.00   166.04   172.05   178.04   184.01   189.96

                                 P      140      145      150      155      160      165      170

"Much too probable"            .999    92.66    96.74   100.84   104.95   109.08   113.22   117.38
"Too probable"                 .99    103.35   107.66   111.98   116.31   120.66   125.01   129.37
"Rather too probable"          .95    113.38   117.89   122.41   126.94   131.47   136.02   140.57
                               .5     139.50   144.50   149.50   154.50   159.50   164.50   169.50
"Of doubtful improbability"    .05    168.33   173.82   179.30   184.77   190.23   195.69   201.14
"Improbable"                   .01    181.06   186.75   192.43   198.10   203.76   209.40   215.04
"Very improbable"              .001   195.89   201.81   207.71   213.60   219.47   225.33   231.17

Seal has pointed out that the values for more than 30 degrees of freedom 
are only approximate. Accurate tables are available for 40, 50, 70, 80, 
90 and 100 degrees of freedom in Biometrika, xxxii, 1941, p. 187. 
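
The source does not say how the approximate values were computed. The figures for more than 30 degrees of freedom are, however, reproduced by Fisher's approximation, in which $\sqrt{2\chi^2} - \sqrt{2n-1}$ is treated as a normal deviate with unit standard deviation. The Python sketch below rests on that inference, and the function name `chi2_fisher` is illustrative only.

```python
import math
from statistics import NormalDist

def chi2_fisher(p_exceed, n):
    """Approximate chi-squared value exceeded with probability p_exceed.

    Uses Fisher's approximation for n degrees of freedom:
    chi^2 is roughly (x + sqrt(2n - 1))^2 / 2, where x is the normal
    deviate exceeded with probability p_exceed.
    """
    x = NormalDist().inv_cdf(1 - p_exceed)
    return 0.5 * (x + math.sqrt(2 * n - 1)) ** 2

for n in (35, 40, 100, 170):
    print(n, [round(chi2_fisher(p, n), 2) for p in (0.99, 0.5, 0.05, 0.001)])
```

For $P = .5$ the deviate is zero and the formula reduces to $n - \frac{1}{2}$, which accounts for the entries 34.50, 39.50 and so on in the middle row of the table.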